Abstract

We  present  simple  algorithms  for  achieving  self-stabilizing  location  management  and  routing  in  mobile 
ad-hoc  networks.  While  mobile  clients  may  be  susceptible  to  corruption  and  stopping  failures,  mobile 
networks  are  often  deployed  with  a  reliable  GPS  oracle,  supplying  frequent  updates  of  accurate  real  time 
and  location  information  to  mobile  nodes.  Information  from  a  GPS  oracle  provides  an  external,  shared 
source  of  consistency  for  mobile  nodes,  allowing  them  to  label  and  timestamp  messages,  and  hence  aiding 
in  identification  of,  and  eventual  recovery  from,  corruption  and  failures.  Our  algorithms  use  a  GPS  oracle. 

Our  algorithms  also  take  advantage  of  the  Virtual  Stationary  Automata  programming  abstraction, 
consisting of mobile clients, virtual timed machines called virtual stationary automata (VSAs), and a
local  broadcast  service  connecting  VSAs  and  mobile  clients.  VSAs  are  distributed  at  known  locations 
over  the  plane,  and  emulated  in  a  self-stabilizing  manner  by  the  mobile  nodes  in  the  system.  They  serve 
as  fault-tolerant  building  blocks  that  can  interact  with  mobile  clients  and  each  other,  and  can  simplify 
implementations  of  services  in  mobile  networks. 

We  implement  three  self-stabilizing,  fault-tolerant  services,  each  built  on  the  prior  services:  (1)  VSA- 
to-VSA  geographic  routing,  (2)  mobile  client  location  management,  and  (3)  mobile  client  end-to-end 
routing.  We  use  a  greedy  version  of  the  classical  depth-first  search  algorithm  to  route  messages  between 
VSAs in different regions. The mobile client location management service is based on home locations:
Each  client  identifier  hashes  to  a  set  of  home  locations,  regions  whose  VSAs  are  periodically  updated  with 
the  client’s  location.  VSAs  maintain  this  information  and  answer  queries  for  client  locations.  Finally,  the 
VSA-to-VSA  routing  and  location  management  services  are  used  to  implement  mobile  client  end-to-end 
routing. 

Keywords: virtual infrastructure, location management, home locations, end-to-end routing, hash functions,
self-stabilization, GPS oracle

1 Introduction

A  system  with  no  fixed  infrastructure  in  which  mobile  clients  may  wander  in  the  plane  and  assist  each 
other  in  forwarding  messages  is  called  an  ad-hoc  network.  The  task  of  designing  algorithms  for  constantly 
changing  networks  is  difficult.  Highly  dynamic  networks,  however,  are  becoming  increasingly  prevalent, 
especially  in  the  context  of  pervasive  and  ubiquitous  computing,  and  it  is  therefore  important  to  develop 
and  use  techniques  that  simplify  this  task.  In  addition,  mobile  nodes  in  these  networks  may  suffer  from 
crash  failures  or  corruption  faults,  which  cause  arbitrary  changes  to  their  program  states.  Self-stabilization 
[4,  5]  is  the  ability  to  recover  from  an  arbitrarily  corrupt  state.  This  property  is  important  in  long-lived, 
chaotic  systems  where  certain  events  can  result  in  unpredictable  faults.  For  example,  transient  interference 
may  disrupt  wireless  communication,  violating  our  assumptions  about  the  broadcast  medium. 

Mobile  networks  are  often  deployed  in  conjunction  with  “reliable”  GPS  services,  supplying  frequent 
updates  of  real  time  and  region  information  to  mobile  nodes.  While  the  mobile  clients  may  be  susceptible  to 
corruption  and  stopping  failures,  the  GPS  service  may  not  be.  Each  of  our  algorithms  utilizes  such  a  reliable 
GPS  oracle.  Information  from  this  oracle  provides  an  external,  shared  source  of  consistency  for  mobile  nodes, 
allowing  them  to  label  and  timestamp  their  messages,  and  hence,  aiding  in  identification  of,  and  recovery 
from,  corruption  and  stopping  failures. 

In  this  paper  we  describe  self-stabilizing  algorithms  that  use  a  reliable  GPS  oracle  to  provide  geographic 
routing,  a  mobile  client  location  management  service,  and  a  mobile  client  end-to-end  routing  service.  Each 
service  is  built  on  the  prior  services  such  that  the  composition  of  the  services  remains  self-stabilizing  [11]. 
In  order  to  route  location  information  between  geographic  regions,  we  use  a  greedy  version  of  the  classical 
depth-first search algorithm. This service is then used to help implement the location management service;
each  mobile  client  identifier  hashes  to  a  set  of  home  locations,  geographical  regions  that  are  periodically 
updated  with  the  location  of  the  client,  and  that  are  responsible  for  then  answering  queries  about  the 
client’s  location.  Both  of  these  services  are  then  used  to  implement  point-to-point  routing  between  mobile 
clients  in  the  network. 

In order to simplify the implementations of the location management and routing services, we mask the
unpredictable behavior of mobile nodes by using a self-stabilizing virtual infrastructure. This infrastructure
consists of mobile client automata; timing-aware and location-aware machines at fixed locations, called
Virtual Stationary Automata (VSAs) [8, 9], that mobile clients can interact with and use to coordinate their
actions; and a local broadcast service connecting VSAs and mobile clients.

Self-stabilization  and  GPS  oracles.  Traditionally,  studies  of  self-stabilizing  systems  are  concerned  with 
those  systems  that  can  be  started  from  arbitrary  configurations  and  eventually  regain  consistency  without 
external  help.  However,  mobile  clients  often  have  access  to  some  reliable  external  information  from  a  service 
such  as  GPS.  Each  of  our  algorithms  in  this  paper  uses  an  external  GPS  service  (or  an  equivalent  service) 
as  a  reliable  GPS  oracle,  providing  periodic  real  time  clock  and  location  updates,  to  base  stabilization  upon; 
our  algorithms  use  timestamps  and  location  information  to  tag  events.  In  an  arbitrary  state,  recorded 
events  may  have  corrupted  timestamps.  Corrupted  timestamps  indicating  future  times  can  be  identified 
and  reset  to  predefined  values;  new  events  receive  newer  timestamps  than  any  in  the  arbitrary  initial  state. 
This  eventually  allows  nodes  in  the  system  to  totally  order  events.  We  use  the  eventual  total  order  to 
provide  consistency  of  information  and  distinguish  between  incarnations  of  activity  (such  as  retransmissions 
of  messages). 
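
The timestamp handling described above can be sketched as follows. This is only an illustration under stated assumptions: the `Event` record, the `RESET_TIMESTAMP` value, and the function names are hypothetical and do not come from the algorithms in this paper. The sketch shows how future-dated timestamps are detected against the GPS clock and reset, after which newly recorded events sort after everything in the arbitrary initial state.

```python
from dataclasses import dataclass

# Hypothetical "predefined value" used when a corrupted timestamp is reset.
RESET_TIMESTAMP = 0.0


@dataclass
class Event:
    payload: str
    timestamp: float


def sanitize(events, gps_now):
    """Reset every timestamp that lies in the future of the GPS clock."""
    for ev in events:
        if ev.timestamp > gps_now:
            ev.timestamp = RESET_TIMESTAMP
    return events


def record(events, payload, gps_now):
    """Tag a new event with the current GPS time and append it to the log."""
    ev = Event(payload, gps_now)
    events.append(ev)
    return ev


# Arbitrary (possibly corrupted) initial state: one event claims a future time.
log = [Event("stale", 3.0), Event("corrupt", 9_999.0)]
sanitize(log, gps_now=10.0)
new = record(log, "fresh", gps_now=10.5)
ordered = sorted(log, key=lambda ev: ev.timestamp)
```

Once the corrupted entry has been reset, sorting by timestamp places the new event last, which is the eventual total order the text relies on.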

Virtual  Stationary  Automata  programming  layer.  In  prior  work  [8,  7,  6],  we  developed  a  notion  of 
“virtual  nodes”  for  mobile  ad  hoc  networks.  A  virtual  node  is  an  abstract,  relatively  well-behaved  active  node 
that  is  implemented  using  less  well-behaved  real  physical  nodes.  The  GeoQuorums  algorithm  [7]  proposes 
storing data at fixed locations; however, it only supports atomic objects, rather than general automata.
A  more  general  virtual  mobile  automaton  is  suggested  in  [6].  Finally,  the  virtual  automata  presented  in 
[8,  9]  (and  used  here)  are  more  powerful  than  those  of  [6],  providing  timing  capabilities  needed  for  many 
applications.  These  automata  are  stationary  and  arranged  in  a  connected  pattern  similar  to  that  of  a 
traditional  wired  network. 

The  static  infrastructure  we  use  in  this  paper  includes  fixed,  timed  virtual  machines  with  an  explicit  notion 
of  real  time,  called  Virtual  Stationary  Automata  (VSAs),  distributed  at  known  locations  over  the  plane  [8,  9]. 
Each VSA represents a predetermined geographic area and has broadcast capabilities similar to those of the
mobile physical nodes, allowing nearby VSAs and mobile nodes to communicate with one another. Many
algorithms  depend  significantly  on  timing,  and  it  is  reasonable  to  assume  that  many  mobile  nodes  have  access 
to  reasonably  synchronized  clocks.  In  the  VSA  layer,  VSAs  also  have  access  to  virtual  clocks,  guaranteed 
to  not  drift  too  far  from  real  time.  The  layer  provides  mobile  nodes  with  a  fixed  virtual  infrastructure, 
reminiscent  of  more  traditional  and  better  understood  wired  networks,  with  which  to  coordinate  their  actions. 

Our  clock-enabled  VSA  layer  is  emulated  by  physical  mobile  nodes  in  the  network.  Each  physical  node  is 
periodically informed of its region by the GPS. A VSA for a particular region is then emulated by a subset of
the  mobile  nodes  in  its  region:  the  VSA  state  is  maintained  in  the  memory  of  the  physical  nodes  emulating 
it,  and  the  physical  nodes  perform  VSA  actions  on  behalf  of  the  VSA.  If  no  physical  nodes  are  in  the  region, 
the  VSA  fails;  if  physical  nodes  later  arrive,  the  VSA  restarts. 

An  important  property  of  the  VSA  layer  implementation  described  in  [8,  9]  is  that  it  is  self-stabilizing. 
Corruption failures at physical nodes can result in inconsistency in the emulation of a VSA. Our
implementation, however, can recover after corruptions to correctly emulate a VSA. To algorithms running
on the VSA layer, the VSA simply appears to suffer from a corruption.

Geographic/VSA-to-VSA routing. A basic service running on the VSA layer that we describe and
use repeatedly is that of VSA-to-VSA (region-to-region) routing (VtoVComm), providing a form of geocast.
GeoCast algorithms [24, 3], GOAFR [19], and algorithms for “routing on a curve” [23] route messages
based on the location of the source and destination, using geography to deliver messages efficiently. GPSR
[17], AFR [20], GOAFR+ [19], polygonal broadcast [10], and the asymptotically optimal algorithm [20]
are based on greedy geographic routing, forwarding messages to the neighbor that is
geographically closest to the destination. The algorithms also address “local minimum situations”, where the
greedy decision cannot be made. GPSR, GOAFR+, and AFR achieve, under reasonable network behavior, an
expected cost that is linear in the distance between the sender and the receiver. We implement VSA-to-VSA
routing using a persistent greedy depth-first search (DFS) routing algorithm that runs on top of the VSA
layer’s fixed infrastructure. Our scheme is an application of the classical DFS algorithm in a new setting.

Location management. Finding the location of a moving client in an ad-hoc network is difficult, much
more so than in cellular mobile networks where a fixed infrastructure of wired support stations exists (as in
[16]), or in sensor networks where some approximation of a fixed infrastructure may exist [2]. A location
service  in  ad-hoc  networks  is  a  service  that  allows  any  client  to  discover  the  location  of  any  other  client 
using  only  its  identifier.  The  basic  paradigm  for  location  services  that  we  use  here  is  that  of  a  home  location 
service:  Hosts  called  home  location  servers  are  responsible  for  storing  and  maintaining  the  location  of  other 
hosts  in  the  network  [1,  14,  21].  Several  ways  to  determine  the  sets  of  home  location  servers,  both  in  the 
cellular  and  entirely  ad-hoc  settings,  have  been  suggested. 

The  locality  aware  location  service  (LLS)  in  [1]  for  ad-hoc  networks  is  based  on  a  hierarchy  of  lattice 
points  for  destination  nodes,  published  with  locations  of  associated  nodes.  Lattice  points  can  be  queried 
for  the  desired  location,  with  a  query  traversing  a  spiral  path  of  lattice  nodes  increasingly  distant  from  the 
source  until  it  reaches  the  destination.  Another  way  of  choosing  location  servers  is  based  on  quorums.  A  set 
of  hosts  is  chosen  to  be  a  write  quorum  for  a  mobile  client  and  is  updated  with  the  client’s  location.  Another 
set  is  chosen  to  be  a  read  quorum  and  queried  for  the  desired  client  location.  Each  write  and  read  quorum 
has  a  nonempty  intersection,  guaranteeing  that  if  a  read  quorum  is  queried,  the  results  will  include  the  latest 
location  of  the  client  written  to  a  write  quorum.  In  [14],  a  uniform  quorum  system  is  suggested,  based  on  a 
virtual  backbone  of  quorum  representatives.  Geographic  quorums  based  on  the  focal  points  abstraction  are 
suggested  in  [7]. 

Location  servers  can  also  be  chosen  using  a  hash  table.  Some  papers  [21,  15,  25]  use  geographic  locations 
as  a  repository  for  data.  These  use  a  hash  to  associate  each  piece  of  data  with  a  region  of  the  network  and 
store  the  data  at  certain  nodes  in  the  region.  This  data  can  then  be  used  for  routing  or  other  applications. 
The Grid location service (GLS) [21] maps client ids to geographic coordinates. A client $C_p$’s location is
saved by the clients closest to the coordinates that p hashes to.

The location management scheme we present here is based on the hash table concept and built on top of
the  VSA  layer  and  VSA-to-VSA  routing  service.  VSAs  and  mobile  clients  are  programmed  to  form  a  self- 
stabilizing,  fault-tolerant  distributed  data  structure  for  location  management,  where  VSAs  serve  as  home 
locations  for  mobile  clients.  Each  client’s  id  hashes  to  a  VSA  region,  the  client’s  home  location,  whose  VSA 
is responsible for maintaining the location of the client. Whenever a client node $C_p$ would like to locate

System constants:
  R, a fixed closed connected region of the 2-D plane.
  U, a finite set of ids for subregions of R.
  m, the size of U.
  region, a mapping from U to connected subsets of R.
  nbrs, a symmetric relation between ids in U.
  $r_{virt}$, the supremum distance between points in u and v for any regions u, v where $u \in nbrs(v)$.
  P, a finite set of client node ids where $P \cap U = \emptyset$.
  $v_{max}$, the maximum client node speed.
  $\epsilon_{sample}$, the GPS sample period.
  d, the broadcast message delay.
  e, the delay factor for VSA outputs.
  $ttl_{VtoV} > d$, the VtoVComm message delay.
  $t_{VSAcor}$, the VSA stabilization time.

System variables:
  now $\in \mathbb{R}$, a clock variable, representing real time.
  loc, a continuously updated array of locations in R of mobile nodes, indexed by node id.

Figure 1: System constants and variables.


another client node $C_q$, $C_p$ would compute the home location of $C_q$ by applying a predefined global hash
function to $C_q$’s id, and query the region represented by the result of that hash for $C_q$’s location. In order
for our scheme to tolerate crash failures of a limited number of VSAs, each mobile client id actually maps
to a set of VSA home locations; the hash function returns a sequence of region ids as the home locations.
We can use any hash function that provides a sequence of region identifiers; one possibility is a permutation
hash function, where permutations of region ids are lexicographically ordered and indexed by client id.
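
One possible reading of the permutation hash function is sketched below. It rests on assumptions that are not part of the paper: client ids are taken to be non-negative integers, the index wraps around modulo the number of permutations, and the home locations are taken to be the first f + 1 entries of the selected permutation. The names `kth_permutation` and `home_locations` are hypothetical.

```python
from math import factorial


def kth_permutation(items, k):
    """Return the k-th (0-based) permutation of `items` in lexicographic order."""
    pool = sorted(items)
    k %= factorial(len(pool))  # assumption: wrap large indices around
    result = []
    for remaining in range(len(pool), 0, -1):
        block = factorial(remaining - 1)  # permutations sharing one leading item
        index, k = divmod(k, block)
        result.append(pool.pop(index))
    return result


def home_locations(client_id, region_ids, f):
    """Hypothetical h: map a numeric client id to f + 1 distinct region ids."""
    if not 0 <= f < len(region_ids):
        raise ValueError("need 0 <= f < number of regions")
    return kth_permutation(region_ids, client_id)[: f + 1]


regions = ["u0", "u1", "u2", "u3"]
homes = home_locations(client_id=7, region_ids=regions, f=1)
```

Because every permutation lists each region id exactly once, any prefix of length f + 1 consists of distinct regions, which is the property required of a sequence of home locations.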

End-to-end routing. Another basic, but difficult to provide, service in mobile networks is end-to-end
routing.  Our  self-stabilizing  implementation  of  a  mobile  client  end-to-end  communication  service  is  simple, 
given  VSA-to-VSA  routing  and  the  home  location  service.  A  client  sends  a  message  to  another  client  by 
using  the  home  location  service  to  discover  the  destination  client’s  region  and  then  has  a  local  VSA  forward 
the  message  to  the  region  using  the  VSA-to-VSA  service. 

Paper  organization.  The  rest  of  the  paper  is  organized  as  follows:  The  system  model  and  the  virtual 
automata  layer  are  described  in  the  next  section.  In  Section  3  we  describe  the  problem  specifications  we 
are  interested  in.  Section  4  describes  the  VSA-to-VSA  communication  implementation.  In  Section  5  we 
describe the implementation of the home location service. In Section 6 we present the implementation of the
end-to-end  routing  service.  Concluding  remarks  appear  in  Section  7. 

2  Datatypes  and  system  model 

The system consists of a bounded region of the 2-D plane, where broadcast-enabled, GPS-updated mobile client
nodes are deployed. We assume the Virtual Stationary Automata programming abstraction [8], which includes
both the mobile client nodes and the virtual stationary automata (VSAs) that the real nodes emulate, as well
as  a  local  broadcast  service,  V-bcast,  between  them  (see  Figure  2).  In  this  section  we  formally  describe  the 
system,  including:  (1)  the  network  tiling,  (2)  the  model  for  the  GPS-augmented  mobile  clients  deployed  in 
the  network,  (3)  the  model  for  the  virtual  nodes  deployed  in  the  network,  and  (4)  the  specification  for  the 
local  broadcast  service  in  the  network.  A  summary  table  of  datatypes,  constants,  and  variables  is  in  Figure 
1. 

2.1  Network  tiling 

The  deployment  space  of  the  network  is  assumed  to  be  a  fixed,  closed,  and  bounded  connected  region  of 
the  2-D  plane  called  R.  R  is  partitioned  into  known  connected  subregions  called  regions,  with  unique  ids 
drawn  from  the  set  of  region  identifiers  U.  In  practice  it  may  be  convenient  to  restrict  regions  to  be  regular 
polygons  such  as  squares  or  hexagons.  We  define  a  neighbor  relation  nbrs  on  ids  from  U.  This  relation  holds 
for  any  two  region  identifiers  u  and  v  where  the  supremum  distance  between  points  in  u  and  v  is  bounded 
by a constant $r_{virt}$.
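
As an illustration of one admissible tiling, the sketch below uses axis-aligned square regions identified by integer grid coordinates. The side length, the chosen bound on the supremum distance, and all function names are assumptions made for this example only; the paper does not fix a particular tiling. With the bound set to the side length times the square root of 5, exactly the edge-adjacent squares are neighbors.

```python
from math import hypot, sqrt

SIDE = 1.0                 # hypothetical side length of each square region
R_VIRT = sqrt(5) * SIDE    # assumed bound: just covers edge-adjacent squares


def sup_distance(u, v, side=SIDE):
    """Largest distance between a point of square u and a point of square v."""
    dx = (abs(u[0] - v[0]) + 1) * side
    dy = (abs(u[1] - v[1]) + 1) * side
    return hypot(dx, dy)


def nbrs(u, v, r_virt=R_VIRT):
    """Symmetric neighbor relation on distinct region ids."""
    # Small tolerance guards against floating-point rounding at the boundary.
    return u != v and sup_distance(u, v) <= r_virt + 1e-12


def region_of(x, y, side=SIDE):
    """Id of the square region containing the point (x, y)."""
    return (int(x // side), int(y // side))
```

Diagonal squares are excluded here because their supremum distance is twice the side length times the square root of 2, which exceeds the assumed bound.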

2.2  Client  nodes 

For  each  p  in  the  set  of  physical  node  identifiers  P,  we  assume  a  mobile  timed  I/O  automaton  client  Cp, 
whose  location  in  R  at  any  time  is  referred  to  as  loc(p).  Mobile  client  speed  is  bounded  by  a  constant  vmax. 
Clients receive region and time information from the GPS oracle. A GPSupdate$(u, now)_p$ happens every
$\epsilon_{sample}$ time at each client $C_p$, indicating to the client the region u where it is currently located and the
current time now. Clients accept this now real-time clock variable as the value of their own local clock. For
simplicity,  this  local  variable  progresses  at  the  rate  of  real  time.  This  implies  that,  outside  of  failures,  the 
local  value  of  now  will  equal  real  time. 

Each client $C_p$ is equipped with a local broadcast service V-bcast (see Section 2.4), allowing it to communicate
with VSAs and clients in its own and neighboring regions through bcast$(m)_p$ and brcv$(m)_p$.

Clients  are  susceptible  to  stopping  and  corruption  failures.  After  a  stopping  failure,  a  client  performs  no 
additional  local  steps  until  restarted.  If  restarted,  it  starts  again  from  an  initial  state.  If  a  node  suffers  from 
a  corruption,  it  experiences  a  nondeterministic  change  to  its  program  state. 

Additional  arbitrary  external  interface  actions  and  local  state  used  by  algorithms  running  at  the  client 
are allowed. For simplicity, local steps are assumed to take no time.


2.3  Virtual  Stationary  Automata  (VSAs) 

Here  we  describe  VSAs;  a  self-stabilizing  implementation  of  such  machines  using  a  GPS  oracle  and  the 
physical  mobile  nodes  in  the  system  can  be  found  in  [8,  9].  An  abstract  VSA  is  a  timing-enabled  virtual 
machine  that  may  be  emulated  by  the  physical  mobile  nodes  in  its  region  in  the  network.  We  formally 
describe a timed machine for region u, $V_u$, as a TIOA whose program is a tuple of its action signature, $sig_u$,
valid states, $states_u$, a start state function mapping clock values to start states, $start_u$, a discrete transition
function, $\delta_u$, and a set of valid trajectories, $\tau_u$. Trajectories [18] describe state evolution over intervals of
time.  The  state  of  Vu  is  referred  to  collectively  as  vstate  and  is  assumed  to  include  a  variable  corresponding 
to  real  time,  vstate.now. 


To  guarantee  that  we  can  emulate  a  VSA  using  physical  mobile  nodes,  its  interface  must  be  emulatable 
by the nodes. Hence, a VSA $V_u$’s external interface is restricted to be similar to that of a client, including only stopping failure,
corruption,  and  restart  inputs,  and  the  ability  to  broadcast  and  receive  messages.  Corruption  failures  result 
in  a  nondeterministic  change  to  vstate. 

Since a VSA is emulated by physical nodes (corresponding to clients) in its region, its failures are
defined in terms of client failures in its region: (1) If no clients are in the region, the VSA is crashed, (2) If
no failures of clients (corruption or stopping) occur in an alive VSA’s region over some interval, the VSA
does not suffer a failure during that interval, and (3) A VSA may suffer a corruption only if a mobile client
in its region suffers a corruption; the self-stabilizing implementation of a VSA in [8, 9] guarantees that
within $t_{VSAcor}$ of an arbitrary configuration of the emulation, the emulation’s external trace will look
like that of the abstract VSA, starting from a corrupted abstract state.

While an emulation of $V_u$ would ideally be identical to a legitimate execution of $V_u$, an abstraction
must reflect that, due to message delays or node failure, the emulation might be behind real time,
appearing to be delayed in performing outputs by up to some time e. The emulation is then a
delay-augmented TIOA, an augmentation of $V_u$ with timing perturbations, represented with buffers $Dout[e]_u$,
composed with $V_u$’s outputs. The buffer delays messages by a nondeterministic time in [0, e], where e is
more than V-bcast’s broadcast delay, d (see Section 2.4). Programs must take into account e, as they do d.


Figure  2:  Virtual  Stationary  Automata  layer.  VSAs 
and  clients  communicate  locally  using  V-bcast.  VSA 
outputs  may  be  delayed  in  Dout. 


2.4  Local  broadcast  service  (V-bcast) 

Communication is in the form of a local broadcast service V-bcast, with broadcast radius $r_{virt}$ and message
delay d. It allows communication between VSAs and clients in the same or neighboring regions. The service
allows the broadcasting and receiving of message m at each port $i \in P \cup U$ through bcast$(m)_i$ and brcv$(m)_i$.

We assume that V-bcast guarantees two properties between VSAs and between VSAs and clients: integrity
and reliable local delivery. Integrity guarantees that for any brcv$(m)_i$ that occurs, a bcast$(m)_j$, $j \in P \cup U$,
previously occurred. Reliable local delivery roughly guarantees that a transmission will be received by
nearby  ports:  If  port  i,  where  i  is  a  client  or  VSA  port  in  any  region  u,  transmits  a  message,  then  every  port 
j,  whether  a  client  or  VSA  port,  in  region  u  or  neighboring  regions  during  the  entire  time  interval  starting 
at  transmission  and  ending  d  later  receives  the  message  by  the  end  of  the  interval.  (For  this  definition,  due 
to  GPSupdate  lag,  a  client  is  still  said  to  be  “in”  region  u  even  if  it  has  just  left  region  u  but  has  not  yet 
received  a  GPSupdate  with  the  change.) 

In  practice,  a  broadcast  service  has  bounded  buffers.  We  assume  buffers  are  large  enough  that  overflows 
do  not  occur  in  normal  operation.  In  the  event  of  overflow,  overflow  messages  are  lost. 

3  Problem  specifications 

We  describe  the  services  we  will  build  over  the  VSA  layer:  VSA-to-VSA  routing,  a  location  service,  and 
client-to-client  routing,  and  describe  our  requirement  that  implementations  be  self-stabilizing. 

The following constants (explained/used shortly) are globally known: (1) $f < m$, a limit on “home
location” VSA failures for a client, (2) h, a function mapping each client id to a sequence of $f + 1$ distinct
region ids, (3) $ttl_{VtoV} > d$, delivery time for the VtoVComm service, (4) $ttl_{HLS} > \epsilon_{sample} + 2d + 3e + 2\,ttl_{VtoV}$,
response time of the location management service, and (5) $ttl_{hb}$, a refresh period. We assume the following
client mobility and VSA crash failure conditions:

(1) Each client spends at least $\epsilon_{sample}$ time in a region before moving to another region,

(2) At any time, each alive client’s current region or a neighboring region has a non-crashed VSA that
remains alive for an additional $ttl_{HLS}$ time,

(3) For any interval of length $ttl_{VtoV} + e$, two VSAs alive over the interval are connected via at least one
path of non-crashed VSAs over the entire interval, and

(4) For any interval of length $ttl_{hb} + 2\,ttl_{VtoV} + 2e + d$, and any alive client q, at least one VSA from h(q)
does not crash during the interval.
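
To make the timing constants concrete, the following sketch evaluates the bounds above for one set of sample values. The numeric values and the function name are purely illustrative assumptions; the formulas are the ones stated in the constants and conditions of this section.

```python
def timing_bounds(eps_sample, d, e, ttl_vtov, ttl_hb):
    """Evaluate the interval lengths used in the assumptions of Section 3."""
    # Constraints stated in the text: ttl_VtoV > d, and e exceeds the delay d.
    if not (ttl_vtov > d and e > d):
        raise ValueError("expected ttl_vtov > d and e > d")
    return {
        # ttl_HLS must be strictly greater than this value.
        "ttl_hls_lower_bound": eps_sample + 2 * d + 3 * e + 2 * ttl_vtov,
        # Condition (3): window over which two alive VSAs stay connected.
        "connectivity_window": ttl_vtov + e,
        # Condition (4): window in which some home location VSA survives.
        "home_location_window": ttl_hb + 2 * ttl_vtov + 2 * e + d,
    }


# Illustrative values only, in arbitrary time units.
bounds = timing_bounds(eps_sample=1, d=1, e=2, ttl_vtov=10, ttl_hb=30)
```

For these sample values, any choice of the location service response time above 29 units satisfies constraint (4) on the constants.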

3.1  VSA-to-VSA  communication  service  (VtoVComm)  specification 

The first service is an inter-VSA routing service, where a VSA from some region u can send a message m
through VtoVsend$(v, m)_u$ to a VSA in another (potentially non-neighboring) region v. Region v’s VSA later
receives m through VtoVrcv$(m)_v$. The service guarantees two properties:

(1) If a VSA at region u performs a VtoVsend(v, m), and both region u and v VSAs are alive over the
time interval beginning with the send and ending $ttl_{VtoV}$ time later, then the VSA at region v performs a
VtoVrcv(m) before the end of the interval, and

(2)  If  a  message  is  received  at  some  VSA,  it  was  previously  sent  to  that  VSA. 
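
The two guarantees can be read as checks over a recorded trace of send and receive events. The sketch below is a simplified checker written for illustration: the trace encoding, the `always_alive` set (regions whose VSAs never fail during the trace), and the function name are assumptions of this example, not part of the specification.

```python
def check_vtov_trace(trace, ttl_vtov, always_alive):
    """Return a list of violations of the two VtoVComm guarantees.

    trace: tuples ("send", t, src, dst, msg) or ("rcv", t, dst, msg).
    always_alive: regions whose VSAs never fail during the trace (simplification).
    """
    violations = []
    sends = [ev for ev in trace if ev[0] == "send"]
    rcvs = [ev for ev in trace if ev[0] == "rcv"]

    # Guarantee (2): each receive has an earlier send of that message to that VSA.
    for _, t_rcv, dst, msg in rcvs:
        if not any(s[3] == dst and s[4] == msg and s[1] <= t_rcv for s in sends):
            violations.append(("spurious receive", dst, msg))

    # Guarantee (1): a send between alive VSAs is delivered within ttl_vtov.
    for _, t_send, src, dst, msg in sends:
        if src in always_alive and dst in always_alive:
            delivered = any(
                r[2] == dst and r[3] == msg and t_send <= r[1] <= t_send + ttl_vtov
                for r in rcvs
            )
            if not delivered:
                violations.append(("late or missing delivery", dst, msg))
    return violations


good = [("send", 0, "u", "v", "m1"), ("rcv", 4, "v", "m1")]
bad = [("send", 0, "u", "v", "m1"), ("rcv", 9, "v", "m1"), ("rcv", 1, "w", "m2")]
```

The first trace satisfies both guarantees for a delivery time of 5, while the second one contains a delivery that arrives too late and a receive with no matching send.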

3.2  Location  service  specification 

A location service answers queries from clients for the locations of other clients. A client node p can submit
a query for a recent region of client node q via a HLquery$(q)_p$ action. If few home location failures occur and
q has been in the system for a sufficient amount of time, the service responds within bounded time with a
recent region location of q, qreg, through a HLreply$(q, qreg)_p$ action.

To be more exact, the location service guarantees that if a client p performs a HLquery to find an alive
client q that has been in the system longer than $\epsilon_{sample} + d + ttl_{VtoV} + e + ttl_{HLS}$ time, and client p does
not crash or move to a different region for $ttl_{HLS}$ time, then:

(1) Within $ttl_{HLS}$ time, client p will perform a HLreply with a region for q, and

(2) If p performs a HLreply(q, qreg), then p had requested q’s location and q was either: (a) alive in region
qreg within the last $ttl_{HLS}$ time, or (b) failed for at most $ttl_{hb} + ttl_{HLS} - \epsilon_{sample}$ time.

3.3  Client  end-to-end  routing  (EtoEComm)  specification 

End-to-end  routing  is  an  important  application  for  ad-hoc  networks.  The  V-bcast  service  provides  a  local 
broadcast  service  where  VSAs  and  clients  can  communicate  with  VSAs  and  clients  in  neighboring  regions. 
VtoVComm allows arbitrary VSAs to communicate. End-to-end routing (EtoEComm) allows arbitrary
clients to communicate: a client p sends message m to client q using send$(q, m)_p$, which is received by q in
bounded time via receive$(m)_q$.

If clients p and q do not crash for $ttl_{HLS}$ time, clients do not change regions for $ttl_{HLS}$ time after a send,
and q has been in the system at least $ttl_{HLS} + \epsilon_{sample} + d + ttl_{VtoV} + e$ time, then:

(1) If client p sends message m to q, q will receive m within $ttl_{HLS} + 2d + 2e + ttl_{VtoV}$ time, and

(2)  Any  message  received  by  a  client  was  previously  sent  to  the  client. 

3.4 Self-stabilizing implementations

We  require  implementations  of  the  above  services  to  be  self-stabilizing.  A  system  configuration  is  safe  with 
respect  to  a  specification  and  implementation  if  any  admissible  execution  fragment  of  the  implementation 
starting from the configuration is an admissible execution fragment of the specification. An implementation
is self-stabilizing if, starting from any configuration, an admissible execution of the implementation
eventually  reaches  a  safe  configuration.  Notice  that  in  the  presence  of  corruptions,  if  an  implementation  is 
self-stabilizing,  then  any  long  enough  execution  fragment  of  the  implementation  will  eventually  have  a  suffix 
that  looks  like  the  suffix  of  some  correct  execution  of  the  specification,  until  a  corruption  occurs. 

Each of the above services’ self-stabilizing implementations will be built on top of self-stabilizing
implementations of other services: VtoVComm over the VSA layer, the location service over the VSA layer
and VtoVComm service, and EtoEComm over the VSA layer, VtoVComm, and location services. Each
self-stabilizing implementation uses lower level services without feedback, so lower level service executions are not
influenced  by  the  upper  level  services.  This  allows  us  to  guarantee  that  higher  level  service  implementations 
are  still  self-stabilizing  through  fair  composition  [11]. 

Our service implementations, starting from an arbitrary system configuration, stabilize within the following
times: VtoVComm: $ttl_{VtoV} + d$ time after the VSA layer stabilizes ($t_{VSAcor}$ time), the location
service: $\max(ttl_{HLS},\ 2e + 3\,ttl_{VtoV} + ttl_{hb} + 2d)$ time after VtoVComm stabilizes, and EtoEComm:
$ttl_{hb} + 2d + 2e + ttl_{VtoV}$ time after the location service has stabilized.
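
Because each layer stabilizes only after the layer beneath it, the bounds above add up along the stack. The following sketch sums them for sample values. The numbers and the function name are illustrative assumptions, and the leading term of the EtoEComm bound is read here as the refresh period, which should be checked against the original text.

```python
def stabilization_times(t_vsacor, d, e, ttl_vtov, ttl_hls, ttl_hb):
    """Per-layer and cumulative stabilization bounds, following the text above."""
    vtov = ttl_vtov + d
    location = max(ttl_hls, 2 * e + 3 * ttl_vtov + ttl_hb + 2 * d)
    # Assumption: the leading term of the EtoEComm bound is the refresh period.
    etoe = ttl_hb + 2 * d + 2 * e + ttl_vtov
    return {
        "vsa_layer": t_vsacor,
        "vtov": vtov,
        "location": location,
        "etoe": etoe,
        "total": t_vsacor + vtov + location + etoe,
    }


# Illustrative values only, in arbitrary time units.
times = stabilization_times(t_vsacor=50, d=1, e=2, ttl_vtov=10, ttl_hls=30, ttl_hb=30)
```

The cumulative figure is an upper bound on when all three services behave according to their specifications, provided no further corruption occurs in the meantime.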


4  VSA  to  VSA  communication  (VtoVComm)  implementation 


The VtoVComm service allows communication of messages between any two VSAs through VtoVsend and
VtoVrcv actions, as long as there is a path of non-failed VSAs between them. The VtoVComm service is
built on top of the V-bcast service [8], which supports communication between two neighboring VSAs
(see Figure 3).

VSA-to-VSA communication is based on a greedy DFS procedure. When a VSA receives a message for which
it is not the destination, it chooses a neighboring VSA that is on a shortest path to the destination
VSA and forwards the message in a forward message to that neighbor. If the VSA does not receive an
indication through a found message that the message has been delivered to the destination within some
bounded amount of time, it then forwards the message to the neighboring VSA on the next shortest path
to the destination VSA, and so on. This choice of neighbors is greedy in the sense that the next
neighbor chosen to receive the forwarded message is the one on a shortest path to the destination VSA,
excluding the neighbors associated with previous tries. The greedy DFS can turn into a flood in
pathological situations in which the destination is the last VSA reached.
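
The order in which neighbors are tried can be sketched as follows. This is only an illustration under stated assumptions: the region graph `adj`, the helper names, and the use of BFS hop counts with alphabetical tie-breaking are hypothetical stand-ins for the paper's NxtNbr function, whose exact definition is not given in this section.

```python
from collections import deque


def hop_distances(adj, dest):
    """BFS hop count from every reachable region to dest."""
    dist = {dest: 0}
    queue = deque([dest])
    while queue:
        r = queue.popleft()
        for n in adj[r]:
            if n not in dist:
                dist[n] = dist[r] + 1
                queue.append(n)
    return dist


def next_neighbor(nbr_set, adj, dest):
    """Pick the untried neighbor closest to dest (ties broken by name)."""
    dist = hop_distances(adj, dest)
    return min(nbr_set, key=lambda n: (dist.get(n, float("inf")), n))


def try_order(u, adj, dest):
    """Order in which region u would try its neighbors for dest."""
    remaining = set(adj[u])
    order = []
    while remaining:
        n = next_neighbor(remaining, adj, dest)
        order.append(n)
        remaining.discard(n)
    return order


# Hypothetical region graph: a-b-d-e and a-c-d.
adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
       "d": ["b", "c", "e"], "e": ["d"]}
print(try_order("a", adj, "e"))
print(try_order("d", adj, "e"))
```

Each later entry in the returned order is tried only after the timeout for the previous one expires without a found message.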


Figure 3: VSA-to-VSA communication (VtoVComm). A VSA at region u sends a message m to region v's
VSA with a VtoVsend(v, m)_u. The message is eventually received at region v by VtoVrcv(m)_v.

 1  Signature:
 2    Input VtoVsend(d, m)_u, d ∈ U, m arbitrary
 3    Input brcv(m)_u, m ∈ ({forward} × Msg × U × {u})
 4        ∪ ({found} × Msg)
 5    Output bcast(m)_u, m arbitrary
 6    Output VtoVrcv(m)_u, m arbitrary
 7    Internal DFStimeout(msg)_u, msg ∈ Msg
 8    Internal DFSclean(msg)_u, msg ∈ Msg
 9    Msg = M × U × U × R, of the form (m, v2vs, v2vd, ts)
10
11  State:
12    analog now ∈ R, the current real time
13    bcastq, VtoVrcvq, queues of messages, initially ∅
14    DFStable, a table indexed on message tuples in
15      Msg with entries in (nbrs(u) × 2^nbrs(u) × R),
16      of the form (isrc, NbrSet, nbrTO), initially ∅
17    curNbr ∈ U, initially ⊥
18
19  Trajectories:
20    satisfies
21      d(now) = 1
22      constant bcastq, VtoVrcvq, DFStable, curNbr
23    stops when
24      Any precondition is satisfied.
25
26  Actions:
27    Output bcast(m)_u
28    Precondition:
29      m ∈ bcastq
30    Effect:
31      bcastq ← bcastq \ {m}
32
33    Input VtoVsend(d, m)_u
34    Effect:
35      if u = d then
36        VtoVrcvq ← VtoVrcvq ∪ {m}
37      else DFStable((m, u, d, now)) ← (u, nbrs(u), now)
38
39    Internal DFStimeout(msg)_u
40    Precondition:
41      DFStable(msg).nbrTO ≤ now
42      ∨ DFStable(msg).nbrTO > now + δ(u, msg.v2vd)
43    Effect:
44      if DFStable(msg).NbrSet ≠ ∅ then
45        curNbr ← NxtNbr(DFStable(msg).NbrSet,
46            DFStable(msg).isrc, u, msg.v2vd)
47        DFStable(msg).NbrSet ← DFStable(msg).NbrSet \ {curNbr}
48        bcastq ← bcastq ∪ {(forward, msg, u, curNbr)}
49        DFStable(msg).nbrTO ← now + δ(u, msg.v2vd)
50      else DFStable(msg) ← null
51
52    Input brcv((forward, msg, isrc, u))_u
53    Effect:
54      if msg.ts ∈ [now - ttl_VtoV, now] then
55        if u = msg.v2vd then
56          bcastq ← bcastq ∪ {(found, msg)}
57          VtoVrcvq ← VtoVrcvq ∪ {msg.m}
58        else if DFStable(msg) = null then
59          DFStable(msg) ← (isrc, nbrs(u) \ {isrc}, now)
60
61    Input brcv((found, msg))_u
62    Effect:
63      if DFStable(msg) ≠ null then
64        DFStable(msg) ← null
65        if u ≠ msg.v2vs then
66          bcastq ← bcastq ∪ {(found, msg)}
67
68    Output VtoVrcv(m)_u
69    Precondition:
70      m ∈ VtoVrcvq
71    Effect:
72      VtoVrcvq ← VtoVrcvq \ {m}
73
74    Internal DFSclean(msg)_u
75    Precondition:
76      DFStable(msg) ≠ null ∧ msg.ts ∉ [now - ttl_VtoV, now]
77    Effect:
78      DFStable(msg) ← null

Figure 4: Greedy DFS algorithm at $V^{VtoV}_u$ for region u.

Self-stabilization of the algorithm is ensured by the use of a real-time timestamp to identify the
version of the DFS. Versions that are too old are eliminated from the system, and new versions are
handled as completely new attempts to complete a greedy DFS towards the destination.
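
The timestamp test that drives this cleanup is small enough to state directly. The sketch below assumes message tuples of the form (m, v2vs, v2vd, ts) as in Figure 4; the function names and the dictionary representation of DFStable are illustrative assumptions.

```python
def is_fresh(ts, now, ttl_vtov):
    """True when ts lies in the window [now - ttl_vtov, now]."""
    return now - ttl_vtov <= ts <= now


def clean(dfs_table, now, ttl_vtov):
    """Keep only entries whose message timestamp is inside the window."""
    return {msg: entry for msg, entry in dfs_table.items()
            if is_fresh(msg[3], now, ttl_vtov)}


# Hypothetical table contents after a corruption.
table = {
    ("m1", "s", "t", 95.0): "entry-1",   # fresh
    ("m2", "s", "t", 80.0): "entry-2",   # too old
    ("m3", "s", "t", 120.0): "entry-3",  # stamped in the future
}
cleaned = clean(table, now=100.0, ttl_vtov=10.0)
print(sorted(cleaned.values()))
```

Rejecting future timestamps as well as old ones is what lets an arbitrarily corrupted entry be discarded without waiting for real time to catch up with it.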

We  first  present  a  simple  greedy  DFS  algorithm  that  gradually  expands  the  search  until  all  paths  are 
checked.  This  algorithm  will  find  a  path  to  the  destination  if  such  a  path  exists  throughout  the  DFS 
execution.  We  also  present  a  modification  of  the  algorithm  to  produce  a  persistent  version  of  the  greedy 
DFS  algorithm  in  which  each  VSA  repeatedly  tries  to  forward  messages  along  previously  unsuccessful  VSA 
paths  to  take  advantage  of  (possibly  temporary)  recoveries  of  VSAs  that  may  result  in  a  viable  path  [13]. 
Again,  the  persistent  greedy  DFS  can  turn  into  a  persistent  flood  in  pathological  situations  in  which  the 
destination  is  the  last  VSA  reached. 

4.1  Detailed  code  description 

The following code description refers to the code for VSA $V^{VtoV}_u$ in Figure 4. The main state
variable DFStable keeps track of information for messages that are still waiting to be delivered. For
each such unique message, the table stores the intermediate source isrc of the message, the set NbrSet
of VSA neighbors that have yet to have the message forwarded to them, and a timeout nbrTO for the
neighbor currently being tried for forwarding the message.

A source VSA $V^{VtoV}_u$ sends a message m to a destination VSA in region d using VtoVsend(d, m)_u
(line 33). If u = d then $V^{VtoV}_u$ receives m through VtoVrcv(m)_u (lines 35-36). Otherwise the
destination VSA is another VSA and $V^{VtoV}_u$ sets the DFStable mapping of an augmented version of
the message, (m, u, d, now), to (u, nbrs(u), now). This enables the start of a new DFS execution to
forward the message to its destination (line 37).

  Internal DFStimeout(msg)_u
  Precondition:
    DFStable(msg).nbrTO ≤ now ∨ DFStable(msg).nbrTO > now + δ(u, msg.v2vd)
  Effect:
    if DFStable(msg).NbrSet ≠ ∅ then
      curNbr ← NxtNbr(DFStable(msg).NbrSet, DFStable(msg).isrc, u, msg.v2vd)
      DFStable(msg).NbrSet ← DFStable(msg).NbrSet \ {curNbr}
      for each n ∈ nbrs(u) \ DFStable(msg).NbrSet
        bcastq ← bcastq ∪ {(forward, msg, u, n)}
      DFStable(msg).nbrTO ← now + δ(u, msg.v2vd)
    else DFStable(msg) ← null

Figure 5: The Persistent Greedy DFS algorithm at $V^{VtoV}_u$ for region u is the same as the Greedy
DFS algorithm, except that the broadcast of a DFS message to curNbr in the DFStimeout action is
replaced with a broadcast to curNbr and all previously attempted neighbors.

Whenever the nbrTO of a message in DFStable times out, it triggers the forwarding of the message to the
next neighbor in the DFS, if possible. If the message hasn't yet been forwarded to all of the relevant
neighbors (DFStable(msg).NbrSet is not empty), then the next neighbor closest to the destination VSA
that has not yet had a message forwarded to it, curNbr, is selected and the message tuple msg is then
forwarded in a forward message to it using the V-bcast service (lines 45-48). The timeout variable
DFStable(msg).nbrTO for this attempt at forwarding is set to $now + \delta(curNbr, msg.v2vd)$ (line 49).
If the message has already been forwarded to all the relevant neighbors, then DFStable(msg) is set to
null, indicating that nothing more can be done.
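
One firing of the timeout action, in both the plain and the persistent variant, might look like the sketch below. It is an illustration only: `timeout_step` and its parameters are hypothetical names, `pick` stands in for NxtNbr, and timers are left out. The persistent branch follows the set expression nbrs(u) \ NbrSet used in Figure 5.

```python
def timeout_step(nbrs, nbr_set, pick=min, persistent=False):
    """Return (forward targets, remaining NbrSet) for one DFStimeout firing.

    A remaining value of None models DFStable(msg) being set to null.
    """
    if not nbr_set:
        return [], None
    cur = pick(nbr_set)
    remaining = set(nbr_set) - {cur}
    if persistent:
        # Re-send to every neighbor no longer in NbrSet, including cur.
        targets = sorted(set(nbrs) - remaining)
    else:
        targets = [cur]
    return targets, remaining


# Neighbor "a" was tried earlier, so only "b" and "c" remain.
print(timeout_step({"a", "b", "c"}, {"b", "c"}))
print(timeout_step({"a", "b", "c"}, {"b", "c"}, persistent=True))
```

The persistent variant trades extra messages for the chance that a previously failed neighbor has recovered in the meantime.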

If a message tuple msg whose destination is $V^{VtoV}_u$ is received in a forward message from isrc,
then VSA $V^{VtoV}_u$ broadcasts a (found, msg) message via the V-bcast service and VtoVrcv's the
message msg.m. The found message notifies neighbors still participating in the DFS for msg that it has
reached its final destination VSA. No forwarding is required (lines 55-57). Otherwise, if msg is not
destined for $V^{VtoV}_u$ and $V^{VtoV}_u$ does not already have an entry in DFStable for msg, then the
message must be forwarded to its destination. DFStable(msg) is set to (isrc, nbrs(u) \ {isrc}, now)
(line 59), storing the intermediate source, initializing the set of neighbors that have yet to have the
message forwarded to them, and setting nbrTO to now. Setting nbrTO to now immediately enables the
DFStimeout action for msg, triggering the forwarding of msg to one of $V^{VtoV}_u$'s neighbors.

When a found message is received for a message tuple msg that is mapped by DFStable, the entry in
DFStable is erased, preventing additional forwarding (line 64). If u ≠ msg.v2vs then VSA $V^{VtoV}_u$
broadcasts a found message via the V-bcast service (lines 65-66), notifying neighbors that are still
participating for msg that it has been delivered. Clearly, if u = msg.v2vs, then no found message is
required and no further action needs to be taken.
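
The two receive handlers described above can be sketched as plain functions over a small state dictionary. This is a hedged illustration, not the authors' code: the state layout, the function names, and the choice to model a null DFStable entry as a missing dictionary key are all assumptions.

```python
def new_state():
    """Per-VSA state: DFS table plus outgoing and delivery queues."""
    return {"table": {}, "bcastq": [], "rcvq": []}


def on_forward(state, u, nbrs, msg, isrc, now, ttl_vtov):
    """Handle (forward, msg, isrc, u) arriving at region u."""
    m, v2vs, v2vd, ts = msg
    if not (now - ttl_vtov <= ts <= now):
        return  # stale or future-stamped message: ignore it
    if u == v2vd:
        state["bcastq"].append(("found", msg))
        state["rcvq"].append(m)
    elif msg not in state["table"]:
        # nbrTO = now, so the timeout action is enabled right away.
        state["table"][msg] = (isrc, set(nbrs) - {isrc}, now)


def on_found(state, u, msg):
    """Handle (found, msg) arriving at region u."""
    if msg in state["table"]:
        del state["table"][msg]
        if u != msg[1]:  # msg[1] is v2vs, the original source
            state["bcastq"].append(("found", msg))


msg = ("hello", "s", "t", 100.0)
relay = new_state()
on_forward(relay, "r", ["s", "t", "x"], msg, "s", now=101.0, ttl_vtov=10.0)
on_found(relay, "r", msg)
on_found(relay, "r", msg)  # duplicate found: ignored
print(relay["bcastq"])
```

In this model a relay rebroadcasts found at most once per message tuple, because the table entry that gates the rebroadcast is deleted on the first found.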

4.2  Correctness 

We now prove the correctness of the algorithm. Let the source VSA be $V^{VtoV}_s$, the destination VSA
be $V^{VtoV}_d$, the message sent be m, and a DFS execution exe from $V^{VtoV}_s$ to $V^{VtoV}_d$ be as
defined above. We assume a given function $\delta : U \times U \to \mathbb{R}^{\geq 0}$, where
$\delta(x, y)$ is a bound on the time required for a message to arrive from x to y. This bound is based
both on the distance between x and y, and the quality of the communication links in the network. Since
the DFS and the $\delta$ function are just employed to cut down on unneeded retransmission of messages,
any non-negative wait time is sufficient for correctness. However, a wait time dependent on hop count
between regions will be the most message-efficient. We argue that if no corruption failures occur and
the status (failed or non-failed) of every VSA in $U$ doesn't change during exe, then the following
holds:

Lemma 4.1 If $V^{VtoV}_s$ is a non-failed VSA that performs a VtoVsend(d, m) at time t, and there
exists a path of non-failed VSAs between $V^{VtoV}_s$ and $V^{VtoV}_d$ from time t to time
$t + ttl_{VtoV}$, then $V^{VtoV}_d$ performs a VtoVrcv(m) in the interval $[t, t + ttl_{VtoV}]$, for
$ttl_{VtoV} > [e + d + (\max_{u,v \in U} \delta(u, v) \cdot \max_{u \in U} |nbrs(u)| - 1)] \cdot (|U| - 1)$.

Proof sketch: The proof is by induction on the distance n between $V^{VtoV}_s$ and $V^{VtoV}_d$ on the
shortest non-deserted path, where the distance is the number of VSAs along the path, including
$V^{VtoV}_d$. In the case n = 0, the message m is destined for the same VSA. According to line 35, the
message is VtoVrcv'ed at the VSA.

Let's assume that the lemma holds for every n' < n.

Let n be the VSA-distance between $V^{VtoV}_s$ and $V^{VtoV}_d$. There exists a path of non-failed VSAs
between $V^{VtoV}_s$ and $V^{VtoV}_d$. Therefore, there exists a VSA $V^{VtoV}_k$, which is a neighbor
of $V^{VtoV}_s$, such that there exists a path of non-failed VSAs between $V^{VtoV}_k$ and
$V^{VtoV}_d$. The distance between $V^{VtoV}_k$ and $V^{VtoV}_d$ is n - 1, hence the induction
assumption holds for $V^{VtoV}_k$ and $V^{VtoV}_d$. Therefore, a message sent from $V^{VtoV}_k$ to
$V^{VtoV}_d$ eventually reaches $V^{VtoV}_d$. The same assumption holds for $V^{VtoV}_s$ and
$V^{VtoV}_k$, therefore, $V^{VtoV}_d$ receives the message m sent from region s. ■

Lemma  4.2  The  number  of  times  that  a  message  tuple  msg  is  re-broadcast  is  bounded. 

Proof  sketch:  The  broadcast  of  a  message  tuple  stops  in  either  of  the  following  cases: 

•  A found message was received for msg. According to line 62, if the value of DFStable(msg) was not
already null, it gets set to null, preventing $V^{VtoV}_u$ from doing anything with subsequent found
messages. If $V^{VtoV}_u$ was not the original source of msg, it retransmits found for msg exactly one
time. If a found for msg is received again, it will be ignored. A forward message for msg would need to
be received again in order to result in any additional found messages for msg at this VSA. This,
however, cannot happen since each VSA participating in the DFS waits before triggering new forward
messages until found messages would have been returned.

•  For each VSA neighbor, if VSA $V^{VtoV}_u$ does not receive a found message for msg it will time out
via nbrTO. Once the set of neighbors to be queried is exhausted, the VSA erases the entry for msg in
DFStable, preventing any additional forwarding by itself.


Lemma 4.3 Once corruptions stop and the VSA layer has stabilized, it takes up to $d + ttl_{VtoV}$ time
for VtoVComm to stabilize.

Proof sketch: Any message in the system that is being forwarded by VtoVComm will be cleaned out of the
system if its timestamp is older than $ttl_{VtoV}$ or newer than the current time. As a result, the
longest a “bad” message can be in the system is this time, plus up to an additional d time where it
could have been in transmission before being received by a VSA. ■


5  Home  Location  Service  (HLS)  implementation 

The location service, as described in the last section, allows a client to determine a recent region of
another alive client. In our implementation, called the Home Location Service (HLS), we accomplish this
using home locations. Recall that the home locations of a client node p are f + 1 regions whose VSAs
are occasionally updated with p's region. The home locations are calculated with a hash function h,
which maps a client's id to a list of VSA regions and is known to all VSAs. These home location VSAs
can then be queried by other VSAs to determine a recent region of p.
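
The paper only requires that h be known to all VSAs and that the f + 1 home locations of a client be distinct regions; it does not fix a construction. One possible construction, offered purely as an assumption, hashes the client id to a starting index and takes f + 1 consecutive regions from a globally agreed ordering:

```python
import hashlib


def home_locations(client_id, regions, f):
    """Return f + 1 distinct regions derived deterministically from client_id.

    `regions` is an ordered list of region ids that every VSA agrees on.
    """
    if f + 1 > len(regions):
        raise ValueError("need at least f + 1 regions")
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(regions)
    # Consecutive indices modulo len(regions) are distinct when f + 1 <= len.
    return [regions[(start + i) % len(regions)] for i in range(f + 1)]


# Hypothetical deployment: ten regions, tolerate f = 2 failed home locations.
regions = [f"r{i}" for i in range(10)]
homes = home_locations("client-p", regions, f=2)
print(homes)
```

Any VSA can recompute the same list from the client id alone, which is what allows queries and updates to meet at the same regions without coordination.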

Figure 6 depicts how the VSA abstraction and VtoVComm are used in HLS. The HLS implementation consists
of two parts: a client-side portion and a VSA-side portion. $C^{HL}_p$ is a subautomaton of client p
that interacts with VSAs to provide HLS. It is responsible for notifying VSAs in its current and
neighboring regions which region it is in. Also, $C^{HL}_p$ handles each request submitted by input
HLquery(q)_p for q's region, by broadcasting the query via V-bcast to VSAs $V^{HL}_u$ in its current
and neighboring regions. It translates responses from the VSAs into HLreply outputs.

Figure 6: Home Location Service. A client p can query local VSAs for client q's region. The VSAs then
query home locations of q, using VtoVComm, for a recent region of q, and return it to p.

For the VSA-side, $V^{HL}_u$ and $V^{HL}_v$ in Figure 6 are home location VSAs corresponding to regions
u and v of the network; they are subautomata of VSAs $V_u$ and $V_v$. $V^{HL}_u$ takes a request from a
local client for client node q's region, calculates q's home locations using the hash function, and
then sends location queries to the home locations using VtoVComm. Those virtual automata respond with
the region information they have for q, which is then provided by $V^{HL}_u$ to the requesting client.
$V^{HL}_u$ is also responsible both for informing the home locations of each client p located in its
region or neighboring regions of p's region, and for maintaining and answering queries for the regions
of clients for which it is a home location.

Time  and  region  information  from  the  GPS  oracle  is  used  throughout  the  HLS  algorithm,  by  clients  and 
VSAs,  to  timestamp  and  label  information  and  messages.  This  information  is  used  to  guarantee  timeliness 
of  replies  from  the  HLS  service,  and  to  stabilize  the  service  after  faults.  Timestamps  are  used  to  determine 
if  information  is  too  old  or  too  new,  while  region  information  allows  clients  and  VSAs  to  know  which  other 
clients  and  VSAs  to  interact  with. 

5.1  HLS  client  actions 

The code executed by client p's $C^{HL}_p$ is in Figure 7.

Clients receive GPSupdates every $\epsilon_{sample}$ time from the GPS automaton (lines 28-33), making
them aware of their current region and the time. If a client's region has changed, the client
immediately sends a heartbeat message with its id, current time and region information. The client
periodically reminds its current and neighboring region VSAs of its region by broadcasting additional
heartbeat messages every $ttl_{hb}$ time, where $ttl_{hb}$ is a known constant (lines 35-39).
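
The heartbeat schedule can be sketched as a small class. The class and method names are hypothetical, the automaton's continuous time is replaced by explicit `gps_update` calls, and the sketch assumes the heartbeat is enabled as soon as the timer equals the current time, which matches the "immediately sends" wording above.

```python
class HeartbeatClient:
    """Toy model of the client's GPSupdate and heartbeat actions."""

    def __init__(self, pid, ttl_hb):
        self.pid = pid
        self.ttl_hb = ttl_hb
        self.now = None   # unknown until the first GPS update
        self.reg = None   # unknown until the first GPS update
        self.hb_to = 0.0  # next heartbeat time

    def gps_update(self, region, t):
        self.now = t
        if self.reg != region:
            self.reg = region
            self.hb_to = t  # region changed: heartbeat is due immediately

    def poll(self):
        """Return a heartbeat broadcast if one is due, else None."""
        if self.reg is not None and self.hb_to <= self.now:
            self.hb_to = self.now + self.ttl_hb
            return (("heartbeat", self.now, self.pid), self.reg)
        return None


c = HeartbeatClient("p", ttl_hb=5.0)
c.gps_update("u", 10.0)
print(c.poll())  # due: region just became known
print(c.poll())  # not due again until time 15.0
```

A region change resets the timer, so VSAs near the new region learn of the client without waiting out the rest of the previous heartbeat period.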

$C^{HL}_p$ also handles the HLquery(q) inputs it receives (line 41). This request for q's location is
stored in a queryq table and, once the client knows its own region, translated into a (clocQuery, q)
message that is broadcast, together with the VSA region, to local regions' VSAs (lines 45-49). If
$C^{HL}_p$ eventually receives a (clocReply, q, qreg) message from its current or neighboring region's
VSA for a client q in queryq, indicating that node q was in region qreg (lines 51-55), it clears the
entry for q in queryq, and outputs a HLreply(q, qreg) of the information (lines 57-61). If the request
for q's location goes unanswered for more than $ttl_{HLS} - \epsilon_{sample}$ time, then the request
has failed and is removed (lines 63-67).
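
The life cycle of a client-side query entry (created with an infinite deadline, stamped when broadcast, cleared on reply or expiry) can be sketched as follows. The class is a hypothetical simplification: it omits the check that a reply comes from the client's own or a neighboring region, and it returns values instead of performing broadcasts.

```python
import math


class QueryTable:
    """Toy model of the client's queryq bookkeeping."""

    def __init__(self, ttl_hls, eps_sample):
        self.budget = ttl_hls - eps_sample
        self.deadline = {}  # client id -> expiry time

    def hl_query(self, q):
        self.deadline[q] = math.inf  # not yet broadcast

    def pending_broadcasts(self, now, reg):
        """Queries to broadcast now; each gets its deadline stamped."""
        if reg is None:
            return []  # own region still unknown
        out = [q for q, t in self.deadline.items() if t > now + self.budget]
        for q in out:
            self.deadline[q] = now + self.budget
        return out

    def on_reply(self, q, qreg):
        """Return the HLreply payload for an outstanding query, else None."""
        if q in self.deadline:
            del self.deadline[q]
            return (q, qreg)
        return None

    def expire(self, now):
        """Remove and return the queries whose deadline has passed."""
        failed = [q for q, t in self.deadline.items() if t < now]
        for q in failed:
            del self.deadline[q]
        return failed


t = QueryTable(ttl_hls=20.0, eps_sample=2.0)
t.hl_query("q")
print(t.pending_broadcasts(now=1.0, reg="u"))
print(t.on_reply("q", "v"))
```

The infinite initial deadline doubles as a "not yet sent" marker, so no separate flag is needed to remember whether the query has been broadcast.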

5.2  HLS  VSA  actions 

The code for automaton $V^{HL}_u$ appears in Figure 8.

First, the VSA knows which clients are in its or neighboring regions through heartbeat messages. If a
VSA hears a heartbeat message from a client p claiming to be in its region or a neighboring region, the
VSA sends a locUpdate message for p, with p's heartbeat timestamp and region, through VtoVComm to the
VSAs at home locations of client p (lines 42-46), where home locations are computed using the known
hash function h from P × {1, ..., f + 1} to U.

 1  Constants:
 2    ttl_hb
 3    ttl_HLS
 4
 5  Signature:
 6    Input GPSupdate(v, t)_p, v ∈ U, t ∈ R
 7    Input HLquery(q)_p, q ∈ P
 8    Input brcv((m, u))_p, m ∈ ({clocReply} × P × U × U), u ∈ U
 9    Output bcast((m, reg))_p, m ∈ (heartbeat, now, p) ∪ {clocQuery} × P
10    Output HLreply(q, v)_p, q ∈ P, v ∈ U
11    Internal queryfail(q)_p, q ∈ P
12
13  State:
14    analog now ∈ R, current real time, initially ⊥
15    hbTO < now + ttl_hb, ∈ R, the next heartbeat time
16    reg ∈ U, the current region, initially ⊥
17    queryq, a table from P to R, initially ∅
18    queryrcv, a queue of P × U pairs, initially ∅
19
20  Trajectories:
21    satisfies
22      d(now) = 1
23      constant hbTO, reg, queryq, queryrcv
24    stops when
25      Any precondition is satisfied.
26
27  Actions:
28    Input GPSupdate(v, t)_p
29    Effect:
30      now ← t
31      if reg ≠ v then
32        reg ← v
33        hbTO ← now
34
35    Output bcast((heartbeat, now, p), reg)_p
36    Precondition:
37      hbTO ≤ now ∧ reg ≠ ⊥
38    Effect:
39      hbTO ← now + ttl_hb
40
41    Input HLquery(q)_p
42    Effect:
43      queryq(q) ← ∞
44
45    Output bcast(((clocQuery, q), reg))_p
46    Precondition:
47      reg ≠ ⊥ ∧ queryq(q) > now + ttl_HLS - ε_sample
48    Effect:
49      queryq(q) ← now + ttl_HLS - ε_sample
50
51    Input brcv(((clocReply, q, qreg), u))_p
52    Effect:
53      if (u ∈ nbrs(reg) ∪ {reg} ∧ queryq(q) ≠ null) then
54        queryrcv ← queryrcv ∪ {(q, qreg)}
55        queryq(q) ← null
56
57    Output HLreply(q, qreg)_p
58    Precondition:
59      (q, qreg) ∈ queryrcv
60    Effect:
61      queryrcv ← queryrcv \ {(q, qreg)}
62
63    Internal queryfail(q)_p
64    Precondition:
65      queryq(q) < now
66    Effect:
67      queryq(q) ← null

Figure 7: HLS's $C^{HL}_p$ automaton. This client subautomaton serves as a bridge between the client's
requests and the VSA layer.

When a VSA receives one of these locUpdate messages for a client p, it stores both the region indicated
in the message as p's current region and the attached heartbeat timestamp in its loc table (lines
48-51). This location information for p is refreshed each time the VSA receives a locUpdate for client
p with a newer heartbeat timestamp. A client sends a heartbeat message every $ttl_{hb}$ time, which can
take up to d + e time to arrive at and trigger a VSA to send a locUpdate message through VtoVComm,
which in turn can take $ttl_{VtoV}$ time to be delivered at a home location. An entry for client p is
therefore erased if its timestamp is older than $ttl_{hb} + d + e + ttl_{VtoV}$ (lines 53-57).
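
The refresh and expiry rules for the loc table can be sketched as below. The function names and the dictionary representation are assumptions; so is the treatment of a missing entry as having timestamp minus infinity, which the figure leaves implicit.

```python
def on_loc_update(loc, q, region, t, now):
    """Store (region, t) for client q if t is newer than the stored entry."""
    prev = loc.get(q)
    prev_ts = prev[1] if prev is not None else float("-inf")
    if prev_ts < t <= now:
        loc[q] = (region, t)


def clean_loc(loc, now, ttl_hb, d, e, ttl_vtov):
    """Erase entries whose timestamp is outside [now - max_age, now]."""
    max_age = ttl_hb + d + e + ttl_vtov
    expired = [q for q, (_, ts) in loc.items()
               if not (now - max_age <= ts <= now)]
    for q in expired:
        del loc[q]


loc = {}
on_loc_update(loc, "p", "u", t=10.0, now=12.0)
on_loc_update(loc, "p", "v", t=9.0, now=12.0)   # older heartbeat: ignored
on_loc_update(loc, "p", "w", t=11.0, now=12.0)  # newer heartbeat: stored
print(loc)
```

Because only strictly newer heartbeat timestamps overwrite an entry, locUpdate messages that arrive out of order cannot roll a client's recorded region backwards.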

The other responsibility of the VSA is to receive and respond to local client requests for location
information on other clients. A client p in a VSA's region or a neighboring region v can send a query
for q's current location to the VSA. This is done via a mobile node's broadcast of a
((clocQuery, q), v) message. When the VSA at region u receives this query, if no outstanding query for
q exists, it notes the request for q in lquery(q), and sends a vlocQuery message to q's f + 1 home
locations, querying about q's location (lines 59-65). Any home location that receives such a message
and has an entry for q's region responds with a vlocReply to the querying VSA with the region (lines
67-70).

If the querying VSA at u receives a vlocReply in response to an outstanding location request for a
client q, it stores the attached region information in lquery(q) (lines 72-75), broadcasts a clocReply
message with q and its region to local clients, and erases the entry for lquery(q) (lines 77-81). If,
however, $2\,ttl_{VtoV} + 2e$ time passes since a request for q's region was received from a local
client and there is no entry for q's region, lquery(q) is just erased (lines 83-87).
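
The querying VSA's bookkeeping can be sketched with three small functions. All names are hypothetical, `homes` stands for the precomputed list of q's f + 1 home locations, and a null lquery entry is modeled as a missing dictionary key.

```python
def on_cloc_query(lquery, vtovq, u, nbrs_u, q, v, now, ttl_vtov, e, homes):
    """Client in region v asks the VSA at u for q's region."""
    outstanding = q in lquery and lquery[q][0] >= now
    if not outstanding and (v == u or v in nbrs_u):
        lquery[q] = (now + 2 * ttl_vtov + 2 * e, None)
        for home in homes:
            vtovq.append((home, (u, ("vlocQuery", q))))


def on_vloc_reply(lquery, q, qreg):
    """A home location answered: remember the region."""
    if q in lquery:
        lquery[q] = (lquery[q][0], qreg)


def pop_reply(lquery, q):
    """Return the clocReply payload once a region is known, clearing the entry."""
    if q in lquery and lquery[q][1] is not None:
        return ("clocReply", q, lquery.pop(q)[1])
    return None


lquery, vtovq = {}, []
on_cloc_query(lquery, vtovq, "u", {"v"}, "q", "v", 0.0, 10.0, 1.0, ["h1", "h2"])
on_cloc_query(lquery, vtovq, "u", {"v"}, "q", "v", 1.0, 10.0, 1.0, ["h1", "h2"])
on_vloc_reply(lquery, "q", "w")
print(len(vtovq), pop_reply(lquery, "q"))
```

The second client request does not trigger new vlocQuery messages, which is the piggybacking behavior that the correctness argument for HLS has to account for.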


 1  Constants:
 2    ttl_VtoV
 3    ttl_hb
 4    h, a hash function from P × {1, ..., f + 1} to U
 5      such that for p ∈ P, x, y ∈ {1, ..., f + 1},
 6      if x ≠ y, then h(p, x) ≠ h(p, y)
 7
 8  Signature:
 9    Input brcv((m, v))_u, m ∈ ({heartbeat} × R × P)
10        ∪ ({clocQuery} × P), v ∈ P
11    Input VtoVrcv((v, m))_u, v ∈ U, m ∈ ({locUpdate} × P ×
12        R) ∪ ({vlocQuery} × P) ∪ ({vlocReply} × P × P)
13    Output bcast(((clocReply, q, qreg), u))_u, q ∈ P, qreg ∈ U
14    Output VtoVsend(v, m)_u, v ∈ U
15    Internal updateHL(q)_u, q ∈ P
16    Internal cleanLoc(q)_u, q ∈ P
17    Internal cleanLquery(q)_u, q ∈ P
18
19  State:
20    loc, a table indexed on process ids with entries
21      from U × R≥0, of the form (reg, ts)
22    lquery, a table indexed on process ids with entries
23      from R≥0 × P, of the form (to, qreg)
24    vtovq, a queue of tuples from U × msg
25    (Above all initially empty)
26    analog now ∈ R≥0, the current real time
27
28  Trajectories:
29    satisfies
30      d(now) = 1
31      constant loc, lquery, vtovq
32    stops when
33      Any precondition is satisfied.
34
35  Actions:
36    Output VtoVsend(v, m)_u
37    Precondition:
38      (v, m) ∈ vtovq
39    Effect:
40      vtovq ← vtovq \ {(v, m)}
41
42    Input brcv(((heartbeat, t, p), v))_u
43    Effect:
44      if (v ∈ nbrs(u) ∪ {u} ∧ now - d < t < now) then
45        for i = 1 to f + 1
46          vtovq ← vtovq ∪ {(h(p, i), (v, (locUpdate, p, t)))}
47
48    Input VtoVrcv((v, (locUpdate, q, t)))_u
49    Effect:
50      if loc(q).ts < t < now then
51        loc(q) ← (v, t)
52
53    Internal cleanLoc(q)_u
54    Precondition:
55      loc(q).ts ∉ [now - ttl_hb - d - e - ttl_VtoV, now]
56    Effect:
57      loc(q) ← null
58
59    Input brcv(((clocQuery, q), v))_u
60    Effect:
61      if ([lquery(q) = null ∨ lquery(q).to < now]
62          ∧ v ∈ nbrs(u) ∪ {u}) then
63        lquery(q) ← (now + 2 ttl_VtoV + 2e, ⊥)
64        for i = 1 to f + 1
65          vtovq ← vtovq ∪ {(h(q, i), (u, (vlocQuery, q)))}
66
67    Input VtoVrcv((v, (vlocQuery, q)))_u
68    Effect:
69      if loc(q) ≠ null then
70        vtovq ← vtovq ∪ {(v, (u, (vlocReply, q, loc(q).reg)))}
71
72    Input VtoVrcv((v, (vlocReply, q, qreg)))_u
73    Effect:
74      if lquery(q) ≠ null then
75        lquery(q).qreg ← qreg
76
77    Output bcast(((clocReply, q, lquery(q).qreg), u))_u
78    Precondition:
79      lquery(q).qreg ≠ ⊥
80    Effect:
81      lquery(q) ← null
82
83    Internal cleanLquery(q)_u
84    Precondition:
85      lquery(q).to ∉ [now, now + 2 ttl_VtoV + 2e]
86    Effect:
87      lquery(q) ← null

Figure 8: HLS's $V^{HL}_u$ automaton.

5.3  Correctness

We make the system assumptions described in Section 3. Call $C_g$ the first global configuration where
the system is consistent. For the following two lemmas and theorem, assume we are in a configuration
after $C_g$, and that no corruption failures occur.

Lemma 5.1 For any VSA u, if there is a request for q's region in lquery, it was submitted through a
HLquery(q) at a client within the last $\epsilon_{sample} + d + 2\,ttl_{VtoV} + 2e$ time.

Proof sketch: Once a request is submitted by a client to $C^{HL}_p$, if the client has not ever
received a GPSupdate, it can take up to $\epsilon_{sample}$ time for the client to receive one. After
the client has received one, it then broadcasts the request to local VSAs, which takes up to d time to
be delivered. VSAs then hold these queries until they expire $2\,ttl_{VtoV} + 2e$ later. ■

Lemma 5.2 Starting $\epsilon_{sample} + d + e + ttl_{VtoV}$ time after client p enters the system and
until p fails, for each interval of length $ttl_{VtoV} + e$, all but f of p's home locations will have
a non-null loc(p) entry for the entire interval. If client p is alive and there is some VSA u such that
loc(p) is not null, p was alive and located in loc(p).reg within the last
$\epsilon_{sample} + d + e + ttl_{VtoV}$ time.

Proof sketch: Within $\epsilon_{sample}$ time of a client entering the system, a GPSupdate occurs and
the client transmits a heartbeat message. This message can take up to d time to be received by a nearby
VSA, after which it can take $e + ttl_{VtoV}$ time for the VSA to transmit the associated locUpdate
message to the client's home locations and have the message be received, updating any alive home
locations' loc(p) entries. Since for any interval of length $ttl_{hb} + d + 2e + ttl_{VtoV}$, at most f
of the client's home locations can be failed at any point in the interval, all but f of the client's
home locations will receive a locUpdate message and have a non-null loc(p) entry, and will remain alive
with a non-null loc(p) entry for at least $ttl_{VtoV} + e$ after the next locUpdate message is received
(within $ttl_{hb} + d + e + ttl_{VtoV}$ time after the first was sent). Since this is true for each
locUpdate message, there can only be f home locations that either do not have a non-null loc(p) entry
or that will not be alive for an additional $ttl_{VtoV} + e$ time.

For the second statement, note that an alive client p will send a heartbeat message within
$\epsilon_{sample}$ time of arriving in a region, prompting updates to loc(p) at alive home locations
within $d + e + ttl_{VtoV}$ time. Hence, if a client is alive, any non-null entry for loc(p).reg can
only be as old as $\epsilon_{sample} + d + e + ttl_{VtoV}$. ■

Theorem 5.3 Every client p searching for a non-failed client q that has been in the system longer than
$ttl_{HLS} + \epsilon_{sample} + d + ttl_{VtoV} + e$ time will perform a HLreply(q, qreg) within time
$ttl_{HLS}$, such that q was located in region qreg no more than $ttl_{HLS}$ time ago. No reply will
occur if q has been failed for more than $ttl_{hb} + ttl_{HLS} - \epsilon_{sample}$ time. Any reply is
in response to a query.

Proof sketch: For the first statement, by the previous lemma, we know that once client q has been in
the system for $\epsilon_{sample} + d + e + ttl_{VtoV}$ time, any queries of its home locations will
succeed in producing a result. However, a new HLquery request “piggybacks” on any prior unexpired
HLquery requests. Since one of these requests could have been initiated just before the client q's home
locations are updated, we can only guarantee a response will be received for a new request if any
outstanding requests will be answered. If the client has been in the system for this total
$ttl_{HLS} + d + e + ttl_{VtoV}$ time after receiving its first GPSupdate, then any response to a query
can take as much as $ttl_{HLS}$ time: $\epsilon_{sample}$ time for the querying client to receive its
first GPSupdate, d time for the query to be transmitted and received by a local VSA, $e + ttl_{VtoV}$
for the local VSA to query a home location, $e + ttl_{VtoV}$ for the response to arrive at a local VSA,
e time for the local VSA to transmit the response to its requesting clients, and d time for the
transmission to be received and translated into HLreplys at clients. This total is $ttl_{HLS}$. As for
the age of the response, by the prior lemma, we know that information can only be out of date by
$\epsilon_{sample} + ttl_{VtoV} + e + d$ time when a home location responds to a query by another VSA.
The response can take $e + ttl_{VtoV}$ time to arrive at the querying VSA, followed by e + d time for
the querying VSA to get the information to the clients that prompted the query. The oldest the
information could be is the total.
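
Adding up the delays enumerated in this part of the proof sketch gives the expressions below. Reading the first sum as the value of $ttl_{HLS}$ relies on the sketch's own statement that the total is $ttl_{HLS}$; the second sum is the age bound, written out for comparison.

```latex
\begin{align*}
\text{response time} &= \epsilon_{sample} + d + (e + ttl_{VtoV}) + (e + ttl_{VtoV}) + e + d \\
                     &= \epsilon_{sample} + 2d + 3e + 2\,ttl_{VtoV}, \\
\text{age of reply}  &= (\epsilon_{sample} + ttl_{VtoV} + e + d) + (e + ttl_{VtoV}) + (e + d) \\
                     &= \epsilon_{sample} + 2d + 3e + 2\,ttl_{VtoV}.
\end{align*}
```

The two sums agree term by term, which is consistent with the theorem using the same constant for both the response time and the age of the reported region.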

For the second statement, note that a failed client will not send a heartbeat message. Since loc(p)
entries are cleared once $ttl_{hb} + d + e + ttl_{VtoV}$ time has passed since the heartbeat message
upon which it was based was broadcast, and the information from the entry can only take as much as
$e + ttl_{VtoV}$ time to reach a querying VSA and e + d time to reach any querying clients, the total
is the maximum time a HLreply can occur after the client fails.

For the third statement, note that a query expires after $ttl_{HLS}$ time. Hence, any response
generated must be for a query that occurred no more than that time before. ■

Theorem 5.4 Starting from an arbitrary configuration, after VtoVComm has stabilized, it takes
$\max(ttl_{HLS},\ 2e + 3\,ttl_{VtoV} + ttl_{hb} + 2d)$ time for HLS to stabilize.

Proof sketch: Once lower levels have stabilized, most client state is made locally consistent within ε_sample
time, the time for the client to get a GPSupdate. This action resets most variables if the region is updated.

The remaining portions of client state are made consistent instantaneously with local correction actions, with
the exception of the heartbeat timer and query q variables. The heartbeat timer can only affect operations
for at most ttl_hb time. The query q variable can only affect operations for ttl_HLS time, after which it is
deleted.

For  VSAs,  there  are  two  variables  that  are  not  instantaneously  corrected:  loc  and  Iquery. 

The loc variable will be consistent within time e + 2ttl_VtoV + ttl_hb + d. At worst, a corrupted
message could arrive at a VSA after ttl_VtoV time, adding a bad entry that takes e + ttl_VtoV + ttl_hb + d time to
expire. If the client referred to is in the system, the information might not be cleaned up until the next update
after the timestamp of the corrupted message arrives (the corrupted message could have been delivered as late as
ttl_VtoV after corruptions stopped). This time is exactly what the offset term for loc timeouts describes.
Hence, the variable might not be cleaned until ttl_VtoV plus that offset term.

However, there may be responses based on this bad loc table information that were sent right at
e + 2ttl_VtoV + ttl_hb + d, and that take e + ttl_VtoV to arrive at the VSA. The resulting transmission (taking d
time to complete) to local clients is then incorrect. However, those incorrect transmissions cease after the
total time 2e + 3ttl_VtoV + ttl_hb + 2d elapses.

The Iquery variable is cleaned up within ttl_HLS time. An entry in Iquery only has a total of 2ttl_VtoV + 2e
time in the data structure. It could be the case that a spurious request was transmitted in the beginning,
which adds d time. If a region response is received, it results in immediate correction of the state through
erasure. Hence, the time required to be consistent is the time that it takes for a query to be accounted for.

The maximum of ttl_HLS and 2e + 3ttl_VtoV + ttl_hb + 2d is the maximum stabilization time. ■
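The arithmetic behind the second term of the maximum can be written out explicitly. This is only a restatement of the delays named in the proof sketch; T_loc is shorthand introduced here for the worst-case time until loc is clean, not a symbol from the paper.

```latex
\begin{align*}
T_{loc} &= \mathit{ttl}_{VtoV} + (e + \mathit{ttl}_{VtoV} + \mathit{ttl}_{hb} + d)
         = e + 2\,\mathit{ttl}_{VtoV} + \mathit{ttl}_{hb} + d \\
T_{loc} + (e + \mathit{ttl}_{VtoV}) + d
        &= 2e + 3\,\mathit{ttl}_{VtoV} + \mathit{ttl}_{hb} + 2d \\
\text{stabilization time}
        &= \max\bigl(\mathit{ttl}_{HLS},\; 2e + 3\,\mathit{ttl}_{VtoV} + \mathit{ttl}_{hb} + 2d\bigr)
\end{align*}
```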

5.4  Extensions 

Here  we  briefly  describe  some  possible  extensions  to  our  HLS  algorithm: 

Home location voting mechanisms: In systems where corruption failures are limited in number at the
VSA level, our implementation could be extended to use a voting mechanism that weeds out
information from corrupted home locations. Rather than waiting for a single region response
from a home location VSA, querying VSAs could wait until the same region is returned from a majority of
home location VSAs. If corruption is limited to some small number of VSAs at a time, but can happen often,
then this voting mechanism can be used to provide a stronger location service, immune to this limited
number of faults.
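The majority rule described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not part of the HLS algorithm: the function name `majority_region` is invented here, responses are assumed to be hashable region identifiers, and the caller is assumed to know how many home locations the client hashes to.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional


def majority_region(responses: Iterable[Hashable],
                    num_home_locations: int) -> Optional[Hashable]:
    """Return the region reported by a strict majority of home locations.

    Returns None while no region has yet been reported by more than half
    of the home locations, in which case the querying VSA keeps waiting.
    """
    counts = Counter(responses)
    if not counts:
        return None
    region, votes = counts.most_common(1)[0]
    return region if votes > num_home_locations // 2 else None
```

A querying VSA would call this each time a new region response arrives and act only once the result is not None.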

Randomized asymmetric quorums: It is possible to have asymmetric updates and queries, such as with
local updates to close-by VSAs and queries to uniformly selected VSAs, or vice versa (the expected number of
VSAs that are required to be updated and queried is small, as proved in [22]). Instead of using a predefined set
to query, one might use a randomized scheme based on [22], where a random set of regions is chosen for
updating and inquiring about the location of a client node. Moreover, we could enhance the scheme in [22]
by using a predefined set for location updates (such as the close-by regions) and a random set for location
queries (or vice versa).

Attribute queries: There are scenarios in which one would like to query for client nodes with certain
attributes in a geographic area (e.g., a search for a medical doctor that is currently nearby). Our scheme
supports such queries in a natural way: Attributes can hash to home locations that store tables of clients
with the attribute, and their locations. Clients searching for another nearby client with some attribute could
then have a local VSA query home locations for the attribute, and select a nearby client from the list that
is returned.
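As a rough illustration of the attribute-query idea, the sketch below hashes an attribute name to a set of region identifiers and picks the closest client from a returned table. Everything concrete here is an assumption made for the example: the SHA-256 based mapping, the function names `home_locations` and `nearest_client`, and the representation of a table as a dictionary from client id to a planar position.

```python
import hashlib
from math import dist
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def home_locations(key: str, regions: List[str], k: int = 2) -> List[str]:
    """Map a key (client id or attribute name) to k region ids (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(regions)
    return [regions[(start + i) % len(regions)] for i in range(min(k, len(regions)))]


def nearest_client(table: Dict[str, Point], here: Point) -> Optional[str]:
    """Pick the client in an attribute table whose recorded position is closest."""
    if not table:
        return None
    return min(table, key=lambda client: dist(table[client], here))
```

The same `home_locations` mapping could serve both client identifiers and attribute names, since both are just keys that hash to regions.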

6  Client  end-to-end  routing  (EtoEComm)  implementation 

Our implementation of the end-to-end routing service, EtoEComm, uses the location service to discover a
recent region location of a destination client node and then uses this location in conjunction with VtoVComm
to deliver messages (see Figure 9). As in the implementation of the Home Location Service, there are two
parts to the end-to-end routing implementation: the client-side portion and the VSA-side portion. Also as in
HLS, time and region information from the GPS oracle is used throughout this implementation to timestamp
and label information.

Figure 9: End-to-end routing. A client CE2E_p can send a message to another client CE2E_q by querying HLS
for q's region, and then having local VSAs forward the message to q's local regions through VtoVComm.
The message is received by those VSAs and broadcast for delivery by CE2E_q.

The client-side portion CE2E takes a request to send a message to another client q, queries the HLS
for q's location, and submits the message to have it sent by a VSA in its current or neighboring regions
to q's location. It also takes messages originating at other clients and transmitted to it by its current or
neighboring regions' VSAs, and delivers them.

The  VSA  VE2E  portion  is  very  simple.  A  client  may  send  it  information  to  be  transmitted  to  other 
VSAs,  which  it  forwards  through  VtoVComm,  or  another  VSA  may  send  it  information  to  be  delivered  at  a 
client  in  its  own  or  a  neighboring  region,  which  it  forwards  through  V-bcast. 

6.1  EtoEComm  client  actions 

The signature, state, and actions of CE2E are in Figure 10. The main variable phbook is a table, indexed
on destination pid, with entries of the form (reg, ttl, msg). For a client q, phbook(q).reg stores the current
region of q (unless it is unknown, in which case it is ⊥). The field ttl stores a timeout for phbook(q).reg if
the region of q is known and stores a timeout for querying for the region if not. The set msg stores messages
being sent to q.

The  GPSupdate(v,  t)  action  (line  36)  results  in  an  update  of  the  client’s  reg  variable  to  the  region  v 
indicated  in  the  action  and  a  reset  of  the  local  clock. 

A message m is sent to another client q via send(q, m)_p. This input to CE2E results in the forwarding
of the message to the VSA of p's current region u through bcast(((sdata, m, q, phbook(q).reg), p, u)) if a region
phbook(q).reg for q is known (lines 44-45), or the saving of the message in phbook(q).msg, if the client does
not have the location of q (lines 46-48).

If a recent region for q is not known, CE2E attempts to discover one. It queries HLS to determine where q
was through the HLquery(q)_p action (line 50). A timeout for response to the location request, phbook(q).ttl,
is set for ttl_HLS later. If the timeout expires but no messages are waiting to be sent, cleanPhbook(q) erases
the entry, preventing unnecessary HLquerying (line 63).

Once a response to an HLquery(q) is received from HLS in the form of HLreply(q, qreg)_p (line 57), indicating
q was in region qreg, entry phbook(q).reg is updated to qreg and phbook(q).ttl is updated to now + ttl_pb,
storing the location of q and setting a timeout for use of the location information. For each message waiting
to be sent to q in queue phbook(q).msg, the message, with the location information for the destination,
is forwarded to p's current and neighboring regions' VSAs through a bcast(((sdata, m, q, qreg), p, u)) (lines
59-60, 30-34).

    Constants:
 2    ttl_HLS
      ttl_pb
 4
    Signature:
 6    Input HLreply(q, v)_p, q ∈ P, v ∈ U
      Input send(q, m)_p, q ∈ P
 8    Input GPSupdate(v, t)_p, v ∈ U, t ∈ R
      Input brcv(((rdata, m), p, u))_p, u ∈ U
10    Output bcast(m)_p
      Output HLquery(q)_p, q ∈ P
12    Output receive(m)_p
      Internal cleanPhbook(q)_p, q ∈ P
14
    State:
16    analog now ∈ R, current real time, initially ⊥
      reg ∈ U, the current region, initially ⊥
18    phbook, a table indexed on process id with entries from
        U × R × 2^msg, of the form (reg, ttl, msg), initially ∅
20    sdataq, deliverq, queues of messages, initially ∅

22  Trajectories:
      satisfies
24      d(now) = 1
        constant reg, phbook, sdataq, deliverq
26    stops when
        Any precondition is satisfied.
28
    Actions:
30    Output bcast(((sdata, m, q, qreg), p, reg))_p
      Precondition:
32      (m, q, qreg) ∈ sdataq ∧ reg ≠ ⊥
      Effect:
34      sdataq ← sdataq \ {(m, q, qreg)}

36    Input GPSupdate(v, t)_p
      Effect:
38      now ← t
        if reg ≠ v then
40        reg ← v

42    Input send(q, m)_p
      Effect:
44      if (phbook(q).reg ≠ ⊥ ∧ phbook(q).ttl > now) then
          sdataq ← sdataq ∪ {(m, q, phbook(q).reg)}
46      else if (phbook(q) = null ∨ phbook(q).ttl < now) then
          phbook(q) ← (⊥, ⊥, {m})
48      else phbook(q).msg ← phbook(q).msg ∪ {m}

50    Output HLquery(q)_p
      Precondition:
52      phbook(q) = (⊥, ttl, m ≠ ∅)
        ∧ (ttl = ⊥ ∨ ttl > now + ttl_HLS)
54    Effect:
        phbook(q).ttl ← now + ttl_HLS
56
      Input HLreply(q, qreg)_p
58    Effect:
        for each m ∈ phbook(q).msg
60        sdataq ← sdataq ∪ {(m, q, qreg)}
        phbook(q) ← (qreg, now + ttl_pb, ∅)
62
      Internal cleanPhbook(q)_p
64    Precondition:
        phbook(q) = (qreg, ttl, msg) ∧ [(qreg = ⊥ ∧ msg = ∅)
66        ∨ (qreg ≠ ⊥ ∧ [ttl > now + ttl_pb ∨ msg ≠ ∅]) ∨ ttl < now]
      Effect:
68      phbook(q) ← null

70    Input brcv(((rdata, m), p, u))_p
      Effect:
72      if u ∈ {reg} ∪ nbrs(reg)
          deliverq ← deliverq ∪ {m}
74
      Output receive(m)_p
76    Precondition:
        m ∈ deliverq
78    Effect:
        deliverq ← deliverq \ {m}

Figure 10: EtoEComm's CE2E automaton.

Messages for client p from other clients are received from the VSA of p's current region or of a neighboring
region v through brcv(((rdata, m), p, v))_p (line 70). The message m is subsequently delivered through the output
receive(m)_p (line 75).
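A minimal Python sketch of the client-side bookkeeping in Figure 10 may help make the phbook transitions concrete. It is an illustration, not the CE2E automaton itself: None stands in for ⊥, a missing dictionary key stands in for null, time is advanced by hand rather than by a trajectory, and the class and method names (`ClientSketch`, `hl_query_enabled`, and so on) are invented here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Entry:
    reg: Optional[str]            # None plays the role of ⊥ (region unknown)
    ttl: Optional[float]          # timeout for reg, or for the pending query
    msg: Set[str] = field(default_factory=set)  # messages waiting for a region


class ClientSketch:
    """Hypothetical model of the phbook handling described for CE2E."""

    def __init__(self, ttl_hls: float, ttl_pb: float) -> None:
        self.now = 0.0
        self.ttl_hls = ttl_hls
        self.ttl_pb = ttl_pb
        self.phbook: Dict[str, Entry] = {}
        self.sdataq: List[Tuple[str, str, str]] = []

    def send(self, q: str, m: str) -> None:
        entry = self.phbook.get(q)
        fresh = entry is not None and entry.ttl is not None and entry.ttl > self.now
        if entry is not None and entry.reg is not None and fresh:
            # Region known and not timed out: queue the message for broadcast.
            self.sdataq.append((m, q, entry.reg))
        elif entry is None or (entry.ttl is not None and entry.ttl < self.now):
            # No usable entry: start a new one that only holds the message.
            self.phbook[q] = Entry(None, None, {m})
        else:
            entry.msg.add(m)

    def hl_query_enabled(self, q: str) -> bool:
        entry = self.phbook.get(q)
        if entry is None or entry.reg is not None or not entry.msg:
            return False
        return entry.ttl is None or entry.ttl > self.now + self.ttl_hls

    def hl_query(self, q: str) -> None:
        # Effect of HLquery: wait at most ttl_hls for the reply.
        self.phbook[q].ttl = self.now + self.ttl_hls

    def hl_reply(self, q: str, qreg: str) -> None:
        entry = self.phbook.get(q)
        pending = sorted(entry.msg) if entry is not None else []
        for m in pending:
            self.sdataq.append((m, q, qreg))
        self.phbook[q] = Entry(qreg, self.now + self.ttl_pb, set())

    def clean_enabled(self, q: str) -> bool:
        entry = self.phbook.get(q)
        if entry is None:
            return False
        expired = entry.ttl is not None and entry.ttl < self.now
        if entry.reg is None:
            return not entry.msg or expired
        too_far = entry.ttl is not None and entry.ttl > self.now + self.ttl_pb
        return too_far or bool(entry.msg) or expired

    def clean(self, q: str) -> None:
        if self.clean_enabled(q):
            del self.phbook[q]
```

A typical run is `send`, then `hl_query` once it is enabled, then `hl_reply`, after which further `send` calls go straight to `sdataq` until the entry times out and `clean` removes it.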

6.2  EtoEComm  VSA  actions 

The  signature,  state,  and  actions  of  VE2E  are  in  Figure  11. 

The receipt of a message m to be sent from a client p to q at qreg through brcv(((sdata, m, q, qreg), p, v))_u,
where v is either u or a neighbor of u (line 33), results in the subsequent forwarding of the message to the
virtual automata at regions in calcregs(qreg) and their neighboring regions, via the virtual automata
communication action VtoVsend(qreg, (data, m, q))_u (lines 33-38). The set calcregs(qreg) contains the regions
that q could occupy by the time the message is delivered to it (since we do not require the client to be
stationary during execution of the algorithm). As will be seen shortly, the definition of calcregs is dependent
on assumptions about client mobility.

Likewise, the receipt, via VtoVrcv((data, m, p))_u (line 40), of message m intended for client p results in
the forwarding of the message to p via bcast(((rdata, m), p, u))_u (line 42).

    Signature:
 2    Input VtoVrcv((data, m, p))_u, p ∈ P
      Input brcv(((sdata, m, q, qreg), p, v))_u, p, q ∈ P, qreg, v ∈ U
 4    Output bcast(m)_u
      Output VtoVsend(v, m)_u, v ∈ U
 6
    State:
 8    vtovq, a queue of tuples from U × msg, initially ∅
      bcastq, a queue of messages, initially ∅
10
    Trajectories:
12    satisfies
        constant vtovq, bcastq
14    stops when
        Any precondition is satisfied.
16
    function calcregs(v: U): 2^U =
18    return nbrs(v) ∪ {v}

20  Actions:
      Output bcast(m)_u
22    Precondition:
        m ∈ bcastq
24    Effect:
        bcastq ← bcastq \ {m}
26
      Output VtoVsend(v, m)_u
28    Precondition:
        (v, m) ∈ vtovq
30    Effect:
        vtovq ← vtovq \ {(qreg, m)}
32
      Input brcv(((sdata, m, q, qreg), p, v))_u
34    Effect:
        if v ∈ nbrs(u) ∪ {u} then
36        let qregions = calcregs(qreg) in
          for each v ∈ qregions ∪ nbrs(qregions)
38          vtovq ← vtovq ∪ {(qreg, (data, m, q))}

40    Input VtoVrcv((data, m, p))_u
      Effect:
42      bcastq ← bcastq ∪ {((rdata, m), p, u)}

Figure 11: EtoEComm's VE2E automaton.
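The VSA-side behaviour can be sketched in Python as follows. This is a hedged illustration rather than the VE2E automaton: regions are assumed to be cells of a square grid identified by integer pairs, `nbrs` is assumed to mean the eight surrounding cells, and each queued copy is addressed to the individual target region, which is one reading of the prose description of forwarding to the regions in calcregs(qreg) and their neighbors. The names `VsaSketch`, `brcv_sdata`, and `vtov_rcv` are invented here.

```python
from typing import List, Set, Tuple

Region = Tuple[int, int]


def nbrs(region: Region) -> Set[Region]:
    """Eight surrounding grid cells (an assumed neighbor relation)."""
    i, j = region
    return {(i + di, j + dj)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)}


def calcregs(qreg: Region) -> Set[Region]:
    """Default definition from the figure: the region and its neighbors."""
    return nbrs(qreg) | {qreg}


class VsaSketch:
    """Hypothetical model of a VSA for region u relaying end-to-end data."""

    def __init__(self, u: Region) -> None:
        self.u = u
        self.vtovq: List[tuple] = []   # pending VtoVsend (destination, payload)
        self.bcastq: List[tuple] = []  # pending local broadcasts

    def brcv_sdata(self, m: str, q: str, qreg: Region, p: str, v: Region) -> None:
        # Only accept data broadcast from this region or a neighboring one.
        if v not in nbrs(self.u) | {self.u}:
            return
        targets = set(calcregs(qreg))
        for r in calcregs(qreg):
            targets |= nbrs(r)
        for dest in sorted(targets):
            self.vtovq.append((dest, ("data", m, q)))

    def vtov_rcv(self, m: str, p: str) -> None:
        # Relay a message that arrived over VtoVComm to local clients.
        self.bcastq.append((("rdata", m), p, self.u))
```

Under the grid assumption, a message for a client last seen in one cell is queued for the 5 x 5 block of cells around it: the 3 x 3 block returned by `calcregs` plus the neighbors of those cells.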

6.3  Correctness 

We  make  the  system  assumptions  described  in  Section  3.  Correctness  of  the  EtoEComm  implementation 
is  dependent  on  assumptions  about  client  mobility  and  the  definition  of  the  function  calcregs,  used  in  the 
EtoEComm  VSA  algorithm.  We  can  prove  correctness  under  either  of  the  following  two  conditions: 

(1) calcregs(qreg) returns the set containing qreg and its neighbors, and each client remains in a region at
least ε_sample + 3ttl_VtoV + 5e + 4d + ttl_pb time before moving to a neighboring region, or

(2) calcregs(qreg) returns the set containing qreg and each region v such that the supremum distance between
any two points in v and qreg is at most v_max · (ε_sample + 3ttl_VtoV + 5e + 4d + ttl_pb).
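To make condition (2) concrete, here is a small sketch of a distance-based calcregs. It assumes that regions are axis-aligned square grid cells of side length `side` identified by integer pairs, which is an assumption of this example and not something the conditions above require; the names `sup_distance` and `calcregs_mobile` are likewise invented. The parameter `horizon` stands for the time bound multiplied by v_max in condition (2).

```python
from math import hypot
from typing import Set, Tuple

Region = Tuple[int, int]


def sup_distance(a: Region, b: Region, side: float) -> float:
    """Largest distance between a point of cell a and a point of cell b."""
    dx = (abs(a[0] - b[0]) + 1) * side
    dy = (abs(a[1] - b[1]) + 1) * side
    return hypot(dx, dy)


def calcregs_mobile(qreg: Region, side: float, v_max: float,
                    horizon: float) -> Set[Region]:
    """qreg plus every cell whose supremum distance to qreg is within reach."""
    reach = v_max * horizon
    span = int(reach // side) + 1  # no cell farther than this can qualify
    result = {qreg}
    for di in range(-span, span + 1):
        for dj in range(-span, span + 1):
            cand = (qreg[0] + di, qreg[1] + dj)
            if sup_distance(cand, qreg, side) <= reach:
                result.add(cand)
    return result
```

Note that qreg is always included, even when the reach is smaller than the diagonal of a single cell, mirroring the wording "the set containing qreg and each region v such that ...".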

We  then  outline  correctness  for  EtoEComm  under  these  assumptions.  For  the  first  lemma  and  theorem, 
assume  we  start  in  a  safe  configuration  and  no  corruption  failures  occur. 

Lemma 6.1 Consider an alive client q such that some other client p has a non-null, non-⊥ entry for
phbook(q).reg. If q does not fail for an additional 2d + 2e + ttl_VtoV time, then at any point in that interval,
q will be located in a region in calcregs(phbook(q).reg).

Proof sketch: First, we note that a non-null, non-⊥ entry phbook(q).reg has information that is at most
ε_sample + 2ttl_VtoV + 3e + 2d out-of-date (from HLS) when it is first installed, after which it is saved for an
additional ttl_pb time.

If we are assuming condition 1, client q must be in the region indicated, or a neighboring region, and
will remain in those regions for an additional 2d + 2e + ttl_VtoV time. If we are assuming condition 2, at
any point up to 2d + 2e + ttl_VtoV later, client q can be in any region reachable from qreg in the total
ε_sample + 3ttl_VtoV + 5e + 4d + ttl_pb time, when traveling at speed v_max. ■
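The total used in both conditions is the sum of the three intervals named in the proof sketch: the staleness of the entry when it is installed, the time it is kept in phbook, and the additional delivery interval of the lemma. Written out as a check of the arithmetic:

```latex
\begin{align*}
&(\varepsilon_{sample} + 2\,\mathit{ttl}_{VtoV} + 3e + 2d)
  + \mathit{ttl}_{pb}
  + (2d + 2e + \mathit{ttl}_{VtoV}) \\
&\qquad = \varepsilon_{sample} + 3\,\mathit{ttl}_{VtoV} + 5e + 4d + \mathit{ttl}_{pb}
\end{align*}
```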

Theorem 6.2 Consider a client p that performs a send(q, m), and does not change regions for ttl_HLS time.
If client q has been in the system for ttl_HLS + ε_sample + d + ttl_VtoV + e time and does not fail, then q will
perform a receive(m) within ttl_HLS + 2d + 2e + ttl_VtoV time. If a client receives a message, it must previously
have been sent to it.

Theorem 6.3 Starting from an arbitrary configuration, after HLS has stabilized, it takes ttl_pb + 2d + 2e +
ttl_VtoV time for EtoEComm to stabilize.

Proof sketch: Bad region information can be in phbook for up to ttl_pb time, and messages sent using this
information are not delivered and cleared until up to d + e + ttl_VtoV + e + d later. At the same time, while HLS
has been stabilizing, phbook's message collection can take up to ttl_HLS time to be cleared. The maximum
of these quantities is the time for EtoEComm to stabilize. ■
6.4 Extensions

Here  we  briefly  describe  some  possible  extensions  to  our  EtoEComm  algorithm: 

Routing optimizations: Once the location of a client is known, communication with the client can be
continued directly, and movements during the conversation may be piggy-backed on the information
transferred in order to update the destination according to the move (as suggested in [12]). We also note that
we can use an embedded tree location scheme such as the one in [12], implemented by virtual automata, where
intermediate tree nodes are also mapped to regions.

Sleeping  client  messaging  service:  Mobile  clients  might  be  able  to  shut  down  to  conserve  power.  We 
could  guarantee  that  a  sleeping  client  eventually  receives  messages  intended  for  it  by  having  local  VSAs  save 
the  messages.  The  VSAs  then,  at  predefined  times,  broadcast  the  messages.  Sleeping  clients  awake  for  these 
broadcasts,  receive  their  messages,  and  can  go  to  sleep  again  afterwards. 
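One way to picture the sleeping-client extension is a per-VSA buffer that is flushed at multiples of a fixed period. The sketch below is purely illustrative: the class name `SleepBuffer`, the use of integer clock values, and the choice of "multiples of `period`" as the predefined broadcast times are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SleepBuffer:
    """Hypothetical VSA-side store for messages addressed to sleeping clients."""

    def __init__(self, period: int) -> None:
        self.period = period
        self.saved: Dict[str, List[str]] = defaultdict(list)

    def save(self, client: str, m: str) -> None:
        # Keep the message until the next predefined broadcast time.
        self.saved[client].append(m)

    def is_broadcast_time(self, now: int) -> bool:
        return now % self.period == 0

    def flush(self, now: int) -> List[Tuple[str, str]]:
        """Return (client, message) pairs to broadcast, or [] between slots."""
        if not self.is_broadcast_time(now):
            return []
        out = [(c, m) for c, msgs in self.saved.items() for m in msgs]
        self.saved.clear()
        return out
```

A sleeping client that knows `period` would wake at the same multiples, listen for its pairs, and go back to sleep.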

7  Concluding  remarks 

We  described  how  both  the  GPS  oracle  and  the  VSA  programming  layer  could  help  implement  self-stabilizing 
geocast  routing,  location  management,  and  end-to-end  routing  services.  The  self-stabilizing  VSA  layer 
provides  a  virtual  fixed  infrastructure  useful  for  solving  a  variety  of  problems.  It  acts  as  a  fault-tolerant, 
self-stabilizing  building  block  for  services,  allowing  applications  to  be  built  for  mobile  networks  as  though 
base  stations  existed  for  mobile  clients  to  interact  with. 

The GPS oracle's frequently refreshed and reliable timing and location information made providing
self-stabilization easier. We believe the paradigm of an external service providing reliable information that can
be used in a self-stabilizing service implementation is an especially important and relevant one in mobile
networks. Mobile networks demonstrate many properties that naturally require self-stabilizing
implementations, such as a need for self-configuration, or the possibility of unpredictable kinds of failures, but also
often have access to reliable external knowledge that can act as a source of shared consistency in the
network; here, accurate region knowledge allowed nodes to determine who they should be communicating with
(current region and neighboring region nodes), and time information allowed them to order messages and
assess timeliness of information.